spam-bot-3000

License: Readme
Development language: Python
Category: Application tools, IM/chat/voice tools
Software type: Open source
Region: Unknown
Submitted by: 谢泉
Operating system: Cross-platform
Open source organization: (not specified)
Target audience: Unknown
Software overview

A Python command-line (CLI) bot for automating research and promotion on popular social media platforms (reddit, twitter, facebook, [TODO: instagram]). With a single command, scrape social media sites using custom queries and/or promote to all relevant results.

Please use with discretion: choose your input arguments wisely, or your bot, along with any associated accounts, could find itself banned from these platforms very quickly. The bot has some built-in anti-spam filter avoidance features to help you remain undetected; however, no amount of avoidance can hide blatantly abusive use of this tool.

features

  • reddit
    • scrape subreddit(s) for lists of keywords, dump results in local file (red_scrape_dump.txt)
      • separate keyword lists for AND, OR, NOT search operations (red_subkey_pairs.json)
      • search new, hot, or rising categories
    • reply to posts in red_scrape_dump.txt with random promotion from red_promos.txt
      • ignore posts by marking them in dump file with "-" prefix
    • praw.errors.HTTPException handling
    • write all activity to log (log.txt)
  • twitter
    • maintain separate jobs for different promotion projects
    • update user status
    • unfollow users who don't reciprocate your follow
    • scan twitter for list of custom queries, dump results in local file (twit_scrape_dump.txt)
      • scan continuously or in overwatch mode
    • optional bypassing of proprietary twitter APIs and their inherent limitations
    • promotion abilities
      • tweepy api
        • follow original posters
        • favorite relevant tweets
        • direct message relevant tweets
        • reply to relevant tweets with random promotional tweet from file (twit_promos.txt)
      • Selenium GUI browser
        • favorite, follow, reply to scraped results while bypassing API limits
      • ignore tweets by marking them in dump file with "-" prefix
    • script for new keyword, hashtag research by gleaning scraped results
    • script for filtering out irrelevant keywords, hashtags, screen names
    • script for automating scraping, filtering, and spamming only most relevant results
    • relatively graceful exception handling
    • write all activity to log (log.txt)
  • facebook
    • zero reliance on proprietary facebook APIs and their inherent limitations
    • Selenium GUI browser agent
    • scrape public and private user profiles for keywords using AND, OR, NOT operators
      • note: access to private data requires login to authorized account with associated access
    • scrape public and private group feeds for keywords using AND, OR, NOT operators
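
Both the reddit and twitter promote steps act only on dump-file lines that are not marked with a leading "-". A minimal sketch of that ignore filter (illustrative only, not the bot's exact code):

# illustrative: keep only dump lines that are not marked to ignore
# (the same "-" convention applies to red_scrape_dump.txt and twit_scrape_dump.txt)
with open("twit_scrape_dump.txt") as f:
    targets = [line.rstrip("\n") for line in f if line.strip() and not line.startswith("-")]
for entry in targets:
    print("would promote to:", entry)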

dependencies

  • install the dependencies you probably don't have already; errors will show up if you're missing any others
    • install pip3: sudo apt install python3-pip
    • install dependencies: pip3 install --user tweepy bs4 praw selenium
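
A quick optional sanity check after installing: confirm the packages import cleanly.

python3 -c "import tweepy, bs4, praw, selenium"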

reddit initial setup

  • update 'praw.ini' with your reddit app credentials
  • replace example promotions (red_promos.txt) with your own
  • replace example subreddits and keywords (red_subkey_pairs.json) with your own
    • you'll have to follow the existing json format
    • keywords_and: all keywords in this list must be present for positive matching result
    • keywords_or: at least one keyword in this list must be present for positive match
    • keywords_not: none of these keywords can be present in a positive match
    • any of the three lists may be omitted by leaving it empty - e.g. "keywords_not": []

<praw.ini>

...

[bot1]
client_id=Y4PJOclpDQy3xZ
client_secret=UkGLTe6oqsMk5nHCJTHLrwgvHpr
password=pni9ubeht4wd50gk
username=fakebot1
user_agent=fakebot 0.1

<red_subkey_pairs.json>

{"sub_key_pairs": [
{
  "subreddits": "androidapps",
  "keywords_and": ["list", "?"],
  "keywords_or": ["todo", "app", "android"],
  "keywords_not": ["playlist", "listen"]
}
]}
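
To illustrate how the three lists combine (a sketch only, assuming case-insensitive substring matching; not necessarily the bot's exact implementation): a post matches when every keywords_and entry is present, at least one keywords_or entry is present (if that list is non-empty), and no keywords_not entry is present.

# illustrative keyword match against a post title
def matches(title, kw_and, kw_or, kw_not):
    t = title.lower()
    if not all(k.lower() in t for k in kw_and):
        return False
    if kw_or and not any(k.lower() in t for k in kw_or):
        return False
    if any(k.lower() in t for k in kw_not):
        return False
    return True

# using the example lists from red_subkey_pairs.json above -> True
print(matches("Is there a list of must-have apps?",
              ["list", "?"], ["todo", "app", "android"], ["playlist", "listen"]))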

reddit usage

usage: spam-bot-3000.py reddit [-h] [-s N] [-n | -H | -r] [-p]

optional arguments:
  -h,	--help		show this help message and exit
  -s N,	--scrape N	scrape subreddits for keywords, both defined in red_subkey_pairs.json; N = number of posts to scrape
  -n,	--new		scrape new posts
  -H,	--hot		scrape hot posts
  -r,	--rising	scrape rising posts
  -p,	--promote	promote to posts in red_scrape_dump.txt not marked with a "-" prefix
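
For example (flags as documented above): scrape 50 new posts, review and mark red_scrape_dump.txt, then promote to the unmarked posts:

python3 spam-bot-3000.py reddit -s 50 -n
python3 spam-bot-3000.py reddit -p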

twitter initial setup

<credentials.txt>

your_consumer_key
your_consumer_secret
your_access_token
your_access_token_secret
your_twitter_username
your_twitter_password
  • create new 'twit_promos.txt' in job directory to store your job's promotions to spam
    • individual tweets on separate lines
    • each line must be <= 140 characters long
  • create new 'twit_queries.txt' in job directory to store your job's queries to scrape twitter for
  • create new 'twit_scrape_dump.txt' file to store your job's returned scrape results
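
A minimal sketch of how those six credential lines map onto a tweepy session (assumes credentials.txt holds one value per line in the order shown above; not necessarily the bot's exact code):

import tweepy

# read one credential per line, in the order listed above
with open("credentials.txt") as f:
    lines = [line.strip() for line in f]
consumer_key, consumer_secret, access_token, access_token_secret, username, password = lines[:6]

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
print(api.verify_credentials().screen_name)  # confirm the keys work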

twitter usage

usage: spam-bot-3000.py twitter [-h] [-j JOB_DIR] [-t] [-u UNF] [-s] [-c] [-e] [-b]
                          [-f] [-p] [-d]
spam-bot-3000
optional arguments:
 -h, --help		show this help message and exit
 -j JOB_DIR, --job JOB_DIR
	                choose job to run by specifying job's relative directory
 -t, --tweet-status 	update status with random promo from twit_promos.txt
 -u UNF, --unfollow UNF
                        unfollow users who aren't following you back, UNF=number to unfollow

 query:
 -s, --scrape		scrape for tweets matching queries in twit_queries.txt
 -c, --continuous	scrape continuously - suppress prompt to continue after 50 results per query
 -e, --english         	return only tweets written in English

spam -> browser:
 -b, --browser          favorite, follow, reply to all scraped results and
                        thwart api limits by mimicking human in browser!

spam -> tweepy api:
 -f, --follow		follow original tweeters in twit_scrape_dump.txt
 -p, --promote		favorite tweets and reply to tweeters in twit_scrape_dump.txt with random promo from twit_promos.txt
 -d, --direct-message	direct message tweeters in twit_scrape_dump.txt with random promo from twit_promos.txt

twitter example workflows

  1. continuous mode
    • -cspf scrape and promote to all tweets matching queries
  2. overwatch mode
    • -s scrape first
    • manually edit twit_scrape_dump.txt
      • add '-' to beginning of line to ignore
      • leave line unaltered to promote to
    • -pf then promote to remaining tweets in twit_scrape_dump.txt
  3. glean common keywords, hashtags, screen names from scrape dumps
    • bash gleen_keywords_from_twit_scrape.bash
      • input file: twit_scrape_dump.txt
      • output file: gleened_keywords.txt
        • results ordered by most occurrences first
  4. filter out keywords/hashtags from scrape dump
    • manually edit gleened_keywords.txt by removing all relevant results
    • filter_out_strings_from_twit_scrape.bash
      • keywords input file: gleened_keywords.txt
      • input file: twit_scrape_dump.txt
      • output file: twit_scrp_dmp_filtd.txt
  5. browser mode
    • -b thwart api limits by promoting to scraped results directly in firefox browser
      • add username and password to lines 5 and 6 of credentials.txt respectively
  6. automatic scrape, filter, spam
    • auto_spam.bash
      • automatically scrape twitter for queries, filter out results to ignore, and spam remaining results
  7. specify job
    • -j studfinder_example/ specify which job directory to execute

Note: if you don't want to maintain individual jobs in separate directories, you may create single credentials, queries, promos, and scrape dump files in the main working directory.
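
Putting the flags together: a continuous scrape-and-promote run against the example job (workflows 1 and 7 above) might look like:

python3 spam-bot-3000.py twitter -j studfinder_example/ -c -s -p -f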

facebook initial setup

  • create new client folder in 'facebook/clients/YOUR_CLIENT'
  • create new 'jobs.json' file to store your client's job information in the following format:

<jobs.json>

{"client_data":
	{"name": "",
	"email": "",
	"fb_login": "",
	"fb_password": "",
	"jobs": [
		{"type": "groups",
			"urls": ["",""],
			"keywords_and": ["",""],
			"keywords_or": ["",""],
			"keywords_not": ["",""] },
		{"type": "users",
			"urls": [],
			"keywords_and": [],
			"keywords_or": [],
			"keywords_not": [] }
	]}
}
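
A quick sketch of reading that file back (illustrative only; assumes it is run from the facebook directory): each job carries its own url list plus AND/OR/NOT keyword lists.

import json

# illustrative: load a client's jobs.json and list what will be scraped
with open("clients/YOUR_CLIENT/jobs.json") as f:
    client = json.load(f)["client_data"]

for job in client["jobs"]:
    print(job["type"], "->", len(job["urls"]), "urls,",
          "AND:", job["keywords_and"],
          "OR:", job["keywords_or"],
          "NOT:", job["keywords_not"])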

facebook usage

  • scrape user and group feed urls for keywords
    • facebook-scraper.py clients/YOUR_CLIENT/
      • results output to 'clients/YOUR_CLIENT/results.txt'

TODO

  • Flesh out additional suite of promotion and interaction tools for the facebook platform
  • Organize platforms and their associated data and tools into their own folders and python scripts
  • Future updates will include modules for scraping and promoting to Instagram.
