scihub2pdf is a module of bibcure
Downloads PDFs via a DOI number, an article title or a BibTeX file, using the databases of LibGen, Sci-Hub and arXiv.
Install with pip:

$ sudo pip install scihub2pdf
If you want to download files from Sci-Hub, you will also need PhantomJS. Install npm first, then PhantomJS:

$ sudo apt-get install npm
$ sudo npm install -g phantomjs
Given a BibTeX file...
$ scihub2pdf -i input.bib
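If you want a quick file to try this with, a minimal input.bib can be created as below. The entry key and title are hypothetical placeholders; the DOI is the example used elsewhere in this README. scihub2pdf reads the doi (or title) field of each entry.

```shell
# Write a one-entry BibTeX file (entry key and title are placeholders).
cat > input.bib <<'EOF'
@article{example2017,
  title = {Some Article Title},
  doi   = {10.1038/s41524-017-0032-0},
  year  = {2017}
}
EOF

# Then fetch a PDF for every entry:
# scihub2pdf -i input.bib
```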
Given a DOI number...
$ scihub2pdf 10.1038/s41524-017-0032-0
Given a title...
$ scihub2pdf --title A useful paper
Given an arXiv ID or title...
$ scihub2pdf arxiv:0901.2686
$ scihub2pdf --title arxiv:Periodic table for topological insulators
Set the download folder with the -l flag:

$ scihub2pdf -i input.bib -l somefolder/
Use LibGen instead of Sci-Hub:
$ scihub2pdf -i input.bib --uselibgen
Given a text file like
10.1038/s41524-017-0032-0
10.1063/1.3149495
.....
download all PDFs:
$ scihub2pdf -i dois.txt --txt
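Building such a list file is a one-liner; a sketch, using the two example DOIs from above (swap in your own):

```shell
# Write one DOI per line to dois.txt.
printf '%s\n' \
  '10.1038/s41524-017-0032-0' \
  '10.1063/1.3149495' > dois.txt

# Then download every PDF in one call:
# scihub2pdf -i dois.txt --txt
```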
Given a text file like
Some Title 1
Some Title 2
.....
download all PDFs:
$ scihub2pdf -i titles.txt --txt --title
Given a text file like
arXiv:1708.06891
arXiv:1708.06071
arXiv:1708.05948
.....
download all PDFs:
$ scihub2pdf -i arxiv_ids.txt --txt