Question:

Cannot connect to a specific website using URLConnection: read timed out

燕和裕

2023-03-14

I am using this code:

public static void main(String[] args) throws IOException {
    String EngLink;
    URL EngUrl;
    URLConnection EngCon;
    String cookiesHeader;
    InputStream EngIs;
    BufferedReader EngBr;
    String line;
    String EngPageHtml="";

    // Open the connection and send a browser-like User-Agent
    EngLink="https://www.zomato.com/";
    EngUrl = new URL(EngLink);
    EngCon = (HttpURLConnection) EngUrl.openConnection();
    EngCon.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");

    // The SocketTimeoutException is thrown here, while waiting for the response headers
    EngIs = EngCon.getInputStream();
    EngBr = new BufferedReader(new InputStreamReader(EngIs,"UTF-8"));

    // Read the response body line by line
    while ((line = EngBr.readLine()) != null) {
        EngPageHtml = EngPageHtml + "\n" + line;
    }

    System.out.println(EngPageHtml);
}

What I am trying to do is get the raw HTML of the website. However, when I run the code, I get this error:

Exception in thread "main" java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.security.ssl.InputRecord.readFully(Unknown Source)
at sun.security.ssl.InputRecord.read(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
at sun.security.ssl.AppInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at project1.Aaa.main(Aaa.java:33)

The same code successfully fetches the HTML of several other websites, but it does not work for this one.

What is the problem, and how can I fix it?

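Edit: loading the site in Firefox, copying the cookie from it, and passing it in with:

EngCon.setRequestProperty("Cookie",cookie);

makes the page load, but this is not a good solution because the same cookie cannot be reused over and over.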

1 answer

堵恺

2023-03-14

The problem can be solved by adding one more request property:

EngCon.setRequestProperty("Accept-Language", "en-US,en;q=0.5");

Nothing else is needed.
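
For reference, here is a minimal sketch of how the question's code might look with that header in place. The class name PageFetcher, the helper methods configure and fetch, and the explicit timeout values are assumptions added for illustration; they are not part of the original code. The explicit timeouts do not fix the problem by themselves. They only make a stalled request fail after a known delay.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class PageFetcher {

    // Hypothetical helper: prepares the request headers but does not send anything yet.
    static HttpURLConnection configure(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) "
                + "Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
        // The header from the answer above.
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        // Assumed values, in milliseconds; adjust as needed.
        con.setConnectTimeout(10_000);
        con.setReadTimeout(15_000);
        return con;
    }

    // Hypothetical helper: downloads the page body as a UTF-8 string.
    static String fetch(String link) throws IOException {
        HttpURLConnection con = configure(new URL(link));
        StringBuilder html = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            con.disconnect();
        }
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetch("https://www.zomato.com/"));
    }
}
```

Compared with the original loop, the StringBuilder avoids repeated string concatenation, and try-with-resources closes the reader even if reading fails.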
