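How to fetch a web page in Java
Ways to fetch web page content
Java offers several ways to fetch the content of a web page. The most common ones are described below.
Using java.net.HttpURLConnection
HttpURLConnection is the HTTP client that ships with the Java standard library. It is suitable for simple HTTP requests.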
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HttpUrlConnectionExample {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");

        // getResponseCode() sends the request and waits for the status line.
        int responseCode = connection.getResponseCode();
        if (responseCode == HttpURLConnection.HTTP_OK) {
            StringBuilder response = new StringBuilder();
            // try-with-resources closes the reader even if reading fails.
            // UTF-8 is assumed here; the page may declare a different charset.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String inputLine;
                while ((inputLine = in.readLine()) != null) {
                    // readLine() strips the line terminator, so add it back.
                    response.append(inputLine).append('\n');
                }
            }
            System.out.println(response);
        } else {
            System.out.println("GET request failed. Response Code: " + responseCode);
        }
        connection.disconnect();
    }
}
Using java.net.http.HttpClient (Java 11+)
Java 11 introduced a new HttpClient that supports HTTP/2 and asynchronous requests.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class HttpClientExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // GET is the default method for a request built this way.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com"))
                .build();
        // send() blocks until the whole body has been read into a String.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
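The asynchronous support mentioned above can be illustrated with sendAsync, which returns a CompletableFuture instead of blocking. The following is only a minimal sketch: the class name AsyncFetchExample and the helper fetchAsync are hypothetical names chosen for illustration, and the URL is the same placeholder used in the other examples.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class AsyncFetchExample {
    // Hypothetical helper: starts the request and returns immediately.
    public static CompletableFuture<String> fetchAsync(HttpClient client, String url) {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .GET()
                .build();
        // sendAsync() does not block; the future completes with the response,
        // and thenApply() maps it to the body text.
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        CompletableFuture<String> page = fetchAsync(client, "https://example.com");
        // Other work could run here while the request is in flight.
        // join() waits for the result and rethrows any failure unchecked.
        System.out.println(page.join());
    }
}
```

A single HttpClient instance can be reused for many requests, so the sketch passes it in rather than creating one per call.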
Using a third-party library (Apache HttpClient)
Apache HttpClient offers richer functionality and suits more complex scenarios.
// The org.apache.http packages below belong to Apache HttpClient 4.x.
import java.nio.charset.StandardCharsets;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class ApacheHttpClientExample {
    public static void main(String[] args) throws Exception {
        HttpGet request = new HttpGet("https://example.com");
        // try-with-resources closes both the response and the client,
        // releasing the underlying connection.
        try (CloseableHttpClient client = HttpClients.createDefault();
             CloseableHttpResponse response = client.execute(request)) {
            // UTF-8 is used only when the response does not declare a charset.
            String result = EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
            System.out.println(result);
        }
    }
}
Using Jsoup to parse HTML
If you need to parse the HTML rather than just download it, use the Jsoup library.

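import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class JsoupExample {
    public static void main(String[] args) throws Exception {
        // connect(...).get() downloads the page and parses it into a DOM-like Document.
        Document doc = Jsoup.connect("https://example.com").get();
        // title() returns the text of the page's <title> element.
        String title = doc.title();
        System.out.println("Title: " + title);
    }
}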
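Notes
- Handle exceptions: network requests may throw IOException or other exceptions, which should be handled properly.
- Set timeouts: to keep a request from blocking for a long time, set both a connect timeout and a read timeout.
- Request headers: some sites may require a User-Agent or other request headers.
- HTTPS: confirm that the target URL serves HTTPS with a certificate the JVM trusts; otherwise you may need to deal with SSL certificate issues.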
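The notes on exceptions, timeouts, and request headers can be sketched with HttpURLConnection as follows. This is a minimal illustration, not a recommended configuration: the class name ConfiguredFetchExample, the helper open, the timeout values, and the User-Agent string are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ConfiguredFetchExample {
    // Hypothetical helper: creates a configured connection without connecting yet.
    public static HttpURLConnection open(String address) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5_000);   // assumed value, in milliseconds
        connection.setReadTimeout(10_000);     // assumed value, in milliseconds
        // Assumed User-Agent string; use whatever identifies your client.
        connection.setRequestProperty("User-Agent", "Mozilla/5.0 (compatible; ExampleFetcher/1.0)");
        return connection;
    }

    public static void main(String[] args) {
        try {
            HttpURLConnection connection = open("https://example.com");
            // getInputStream() connects and throws IOException on error responses.
            try (InputStream in = connection.getInputStream()) {
                // UTF-8 is assumed; the page may declare a different charset.
                System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            // Timeouts, DNS failures, and TLS errors are all IOException subclasses.
            System.err.println("Request failed: " + e.getMessage());
        }
    }
}
```

Catching IOException in one place keeps the sketch short; real code may want to treat a timeout differently from, say, an untrusted certificate.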
The methods above cover most scenarios; choose the one that best fits your needs.






