[tldoc] Script to check the links in the documentation
Uwe Ziegenhagen
ziegenhagen at gmail.com
Sun May 12 21:27:50 CEST 2013
Hi everyone,
I just wrote a small Python script to check if the links inside the
documentation still work:

# http://www.noah.org/wiki/RegEx_Python
import re
import urllib2

# read the whole TeX source into one string
#filehandle = open("C:/Users/Uwe/Desktop/texlive/texlive-de/texlive-de-new.tex")
filehandle = open("C:/Users/Uwe/Desktop/texlive/texlive-en/texlive-en.tex")
text = filehandle.read()
filehandle.close()

# collect everything that looks like an http(s) URL
m = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', text)

i = 0
for item in m:
    i = i + 1
    # running number and URL; the trailing comma keeps the cursor on this line
    print i, '\t', item, '\t',
    try:
        response = urllib2.urlopen(item)
    except urllib2.HTTPError, e:
        # the server answered, but with an error status such as 404
        print e.code
    except urllib2.URLError, u:
        # the server could not be reached at all
        print u.args
    print "\n"

Maybe you will find it helpful. In my German version I found 11 broken links,
and in the English version three:
http://groups.google.com/group/comp.text.tex/topics
http://ctan.example.org/tex-archive/systems/texlive/tlnet/
http://mirror.ctan.org/tex-archive/fonts/greek/cb
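
For anyone who wants to run the check under Python 3, where urllib2 has been
split into urllib.request and urllib.error, a rough equivalent might look like
the sketch below. The function names extract_urls, check_url and report are
made up for this illustration, and the 10-second timeout and the UTF-8 file
encoding are assumptions, not part of the script above. The regular expression
is the same one, written as a raw string. Note that it also picks up a trailing
"." or "," and stops at "~", so some reported links may need a second look by
hand.

```python
import re
import urllib.error
import urllib.request

# Same pattern as in the Python 2 script, written as a raw string.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


def extract_urls(text):
    """Return every substring matched by URL_PATTERN, in order of appearance."""
    return URL_PATTERN.findall(text)


def check_url(url, timeout=10):
    """Return None if the URL opens, else the HTTP status code or error reason.

    The timeout value is an assumption; the original script used none.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return None
    except urllib.error.HTTPError as e:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        return e.code
    except urllib.error.URLError as u:
        return u.reason


def report(path):
    """Print a numbered, tab-separated list of URLs and their status."""
    # Assumes the TeX source is UTF-8 encoded.
    with open(path, encoding="utf-8") as filehandle:
        text = filehandle.read()
    for i, url in enumerate(extract_urls(text), start=1):
        print(i, url, check_url(url) or "ok", sep="\t")
```

Calling report("texlive-en.tex") would then print one line per link, with "ok"
for links that open and the status code or error reason for those that do not.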
Uwe
--
Uwe Ziegenhagen
<http://www.uweziegenhagen.de>