How do I read a web page in Java?
Reading Web Page Content with Java
Using java.net.URL and java.io.BufferedReader
The URL class combined with a BufferedReader is the quickest way to read a page's content. Example:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReadWebPage {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com");
        // Specify the charset explicitly; the platform default may mis-decode the page.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
    }
}
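The loop above prints each line as it arrives; often you want the whole page collected into a single String instead. A minimal sketch using only the standard library (the `readAll` helper name is our own, not a standard API):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReadWebPageToString {
    // Read an entire stream into a String, decoding as UTF-8.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String content = readAll(new URL("https://example.com").openStream());
        System.out.println(content.length() + " characters read");
    }
}
```

Reading by `char[]` buffer rather than `readLine()` also preserves the page's original line endings.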
Using HttpURLConnection for HTTP Requests
HttpURLConnection offers more control, such as setting the request method, timeouts, and headers:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ReadWebPageWithConnection {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // fail if connecting takes longer than 5 s
        conn.setReadTimeout(5000);    // fail if a read blocks longer than 5 s
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);
        }
        reader.close();
        conn.disconnect();
    }
}
Using a Third-Party Library (e.g. Apache HttpClient)
Apache HttpClient provides higher-level features and a more concise API:

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class ReadWebPageWithHttpClient {
    public static void main(String[] args) throws Exception {
        CloseableHttpClient client = HttpClients.createDefault();
        HttpGet request = new HttpGet("https://example.com");
        CloseableHttpResponse response = client.execute(request);
        String content = EntityUtils.toString(response.getEntity());
        System.out.println(content);
        response.close();
        client.close();
    }
}
Parsing HTML with Jsoup
If you need to parse the HTML itself, use the Jsoup library:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ParseWebPageWithJsoup {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("https://example.com").get();
        String title = doc.title();
        System.out.println("Title: " + title);
    }
}
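Beyond the title, Jsoup can pick out elements with CSS selectors. A small sketch parsing an inline HTML string (no network needed; the HTML fragment is made up for illustration):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectWithJsoup {
    public static void main(String[] args) {
        // A made-up HTML fragment for illustration.
        String html = "<html><body>"
                + "<a href='https://example.com/a'>First</a>"
                + "<a href='https://example.com/b'>Second</a>"
                + "</body></html>";
        Document doc = Jsoup.parse(html);
        // CSS selector: every <a> element that has an href attribute.
        for (Element link : doc.select("a[href]")) {
            System.out.println(link.text() + " -> " + link.attr("href"));
        }
    }
}
```

`Jsoup.parse(String)` is handy in tests too, since it works on any HTML you already have in memory.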
Handling Exceptions and Timeouts
In real applications you need to handle network errors and timeouts:
try {
    URL url = new URL("https://example.com");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setConnectTimeout(5000);
    conn.setReadTimeout(5000);
    // read the response here
} catch (IOException e) {
    // covers connect failures, read failures, and SocketTimeoutException
    e.printStackTrace();
}
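That pattern can be wrapped into a small helper that reports timeouts separately from other I/O failures. A minimal sketch, assuming failures should yield null rather than propagate (the `fetchOrNull` name is our own, not a standard API):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class FetchWithTimeout {
    // Fetch a page body, or return null on timeout / network failure.
    static String fetchOrNull(String urlStr, int timeoutMs) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(urlStr).openConnection();
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            StringBuilder sb = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    sb.append(line).append('\n');
                }
            } finally {
                conn.disconnect();
            }
            return sb.toString();
        } catch (SocketTimeoutException e) {
            // SocketTimeoutException is an IOException, so catch it first.
            System.err.println("Timed out: " + e.getMessage());
            return null;
        } catch (IOException e) {
            System.err.println("Request failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        String body = fetchOrNull("https://example.com", 5000);
        System.out.println(body == null ? "failed" : "got " + body.length() + " chars");
    }
}
```

Note the ordering: SocketTimeoutException must be caught before the broader IOException, or the compiler will reject the unreachable clause.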
Summary
- For simple needs, URL with BufferedReader is enough.
- When you need more control, use HttpURLConnection.
- For advanced features (connection pooling, retries), choose Apache HttpClient.
- For parsing HTML content, Jsoup is the ideal choice.