====== Introduction ======
  * Member : 蔡昀達, 廖其忻
  * Meeting :

====== Member ======
^Name^Mail^
|蔡昀達|bb04902103@gmail.com|
|廖其忻|cayon.1318.96@hotmail.com|
|尹聖翔| |

====== Public dataset ======
ISCX-URL-2016 (https://www.unb.ca/cic/datasets/url-2016.html)

Kaggle
  - https://www.kaggle.com/antonyj453/urldataset
  - https://www.kaggle.com/aktank/url-detection
  - https://www.kaggle.com/deepak730/finding-malicious-url-through-url-features

Phishing URLs
  - PhishTank https://www.phishtank.com/developer_info.php
  - OpenPhish https://openphish.com/

Spam URLs
  - JWSPAMSPY http://www.joewein.de/sw/blacklist.htm

Malware URLs (these three have not been updated for a long time)
  - DNS-BH - http://www.malwaredomains.com/wordpress/?page_id=66
  - https://www.malwarepatrol.net/my-account/
  - http://www.malwaredomainlist.com/

Benign URLs
  - Majestic - https://majestic.com/reports/majestic-million

Other sources
  - https://zeltser.com/malicious-ip-blocklists/

----

Black list
  - EtherAddressLookup - https://etherscamdb.info/api
  - Bambenek Consulting - https://osint.bambenekconsulting.com/feeds/
  - FireHOL - http://iplists.firehol.org/
  - Spamhaus and DShield - http://www.squidblacklist.org/downloads/drop.malicious.rsc
  - squidGuard - http://www.squidguard.org/blacklists.html
  - blackweb - https://github.com/maravento/blackweb

====== Intelligence website ======
White List
  - https://www.alexa.com/topsites
  - Chrome extension: Google WOT plugin

----

  - http://whois.domaintools.com
  - https://www.urlvoid.com
  - https://www.ipvoid.com/
  - https://www.apivoid.com/api/domain-reputation/

----

Black List (no usable data at present)
  - (mainly for IP lookup) https://www.abuseipdb.com/
  - (mainly domain names, for reference only) https://www.riskiq.com/platform/architecture/internet-data-sets/passive-dns/
  - (blacklist too small, for reference only) https://otx.alienvault.com/

====== Meeting ======
===== 09/16 =====
  - Check that the baseline CNN model has high accuracy
  - Build TF-IDF model
  - Build explanation feature
  - Triggered pattern
  - Malicious URL family matching
  - Phishing URL survey

===== 07/31 =====
  - Collect training data
  - Collect whois information
  - Manual features
  - Model design

====== Reference ======
^Year^Venue^Title^Link^Assign^
|2017|arXiv|Malicious URL Detection using Machine Learning: A Survey|[[https://arxiv.org/pdf/1701.07179.pdf|PDF]]| |

https://hackmd.io/RKXNLcvUQY2a-cQAESigqw?view