Version 0.7
Copyright © 2008 - 2010 Lars Vogel
30.08.2010
Revision History

| Revision | Date | Author | Description |
|---|---|---|---|
| 0.1 | 03.10.2008 | Lars Vogel | First version |
| 0.2 - 0.7 | 01.05.2009 - 22.04.2010 | Lars Vogel | Bug fixes and enhancements |
Java provides APIs to access resources over the network, for example to read webpages. The main classes used to read web resources are "java.net.URL" and "java.net.HttpURLConnection". "URL" defines a web resource, while "HttpURLConnection" is used to access it.
The Apache Foundation provides a powerful framework to transmit and receive HTTP messages: the HttpClient. Both approaches are demonstrated in this tutorial.
Create a Java project "de.vogella.web.html". The following code reads a webpage from a server and prints it to the console.
package de.vogella.web.html;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class ReadWebPage {
    public static void main(String[] args) throws IOException {
        String urltext = "http://www.vogella.de";
        URL url = new URL(urltext);
        BufferedReader in = new BufferedReader(new InputStreamReader(
                url.openStream()));
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            // Process each line.
            System.out.println(inputLine);
        }
        in.close();
    }
}
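The read loop above uses the platform default charset and only closes the reader if no exception occurs. The following is a minimal sketch of a variant, assuming the page is UTF-8 encoded. The class name PageReader and the method readAll are hypothetical and not part of the tutorial's project. The method accepts any InputStream, so the result of url.openStream() can be passed in.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the tutorial's project.
class PageReader {

    // Reads the stream line by line and returns the content as one string.
    // Assumption: the content is UTF-8 encoded.
    static String readAll(InputStream stream) throws IOException {
        StringBuilder page = new StringBuilder();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                page.append(inputLine).append('\n');
            }
        }
        return page.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openStream() so the sketch runs without a network
        InputStream demo = new ByteArrayInputStream(
                "<html>\n<body>Hello</body>\n</html>"
                        .getBytes(StandardCharsets.UTF_8));
        System.out.print(readAll(demo));
    }
}
```

Taking an InputStream rather than a URL keeps the reading logic independent of the network, which makes it easy to try out with in-memory data.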
HTTP status codes (return codes) are standardized codes which a web server returns to report the outcome of a request. For example, the status code "200" means that the HTTP request was OK and that the server performs the required action, e.g. serving the webpage.
The most important HTTP status codes are:
Table 1.
| Return Code | Explanation |
|---|---|
| 200 | OK |
| 301 | Permanent redirect to another webpage |
| 400 | Bad request |
| 404 | Not found |
The following code accesses a web page and prints the status code of the HTTP response.
package de.vogella.web.html;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReadReturnCode {
    public static void main(String[] args) throws IOException {
        String urltext = "http://www.vogella.de";
        URL url = new URL(urltext);
        int responseCode = ((HttpURLConnection) url.openConnection())
                .getResponseCode();
        System.out.println(responseCode);
    }
}
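The codes from Table 1 are also available as named constants on HttpURLConnection, which avoids literal numbers in the code. The following sketch maps a response code to the explanations from the table. The class StatusCodes and the method describe are hypothetical names chosen for illustration.

```java
import java.net.HttpURLConnection;

// Hypothetical helper, not part of the tutorial's project.
class StatusCodes {

    // Maps a response code to the explanation used in Table 1.
    static String describe(int responseCode) {
        switch (responseCode) {
        case HttpURLConnection.HTTP_OK:          // 200
            return "OK";
        case HttpURLConnection.HTTP_MOVED_PERM:  // 301
            return "Permanent redirect to another webpage";
        case HttpURLConnection.HTTP_BAD_REQUEST: // 400
            return "Bad request";
        case HttpURLConnection.HTTP_NOT_FOUND:   // 404
            return "Not found";
        default:
            return "Other (" + responseCode + ")";
        }
    }

    public static void main(String[] args) {
        // In ReadReturnCode the argument would be the value of getResponseCode()
        System.out.println(describe(200));
        System.out.println(describe(404));
    }
}
```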
The Internet media type (MIME type), also called content type, defines the type of a web resource. It is a two-part identifier for file formats on the Internet. For an HTML page the content type is "text/html".
The following code reads the content type (MIME type) of a web resource.
package de.vogella.web.html;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReadMimeType {
    public static void main(String[] args) throws IOException {
        String urltext = "http://www.vogella.de";
        URL url = new URL(urltext);
        String contentType = ((HttpURLConnection) url.openConnection())
                .getContentType();
        System.out.println(contentType);
    }
}
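Servers often append parameters to the content type, for example "text/html; charset=UTF-8". The following sketch splits such a value into the two-part media type and the optional charset. The class ContentTypeParser and its methods are hypothetical. Note that getContentType() may return null if the server does not send the header, which a caller would need to check first.

```java
import java.util.Locale;

// Hypothetical helper, not part of the tutorial's project.
class ContentTypeParser {

    // Returns the bare media type, e.g. "text/html", without parameters.
    static String mediaType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String type = semicolon < 0
                ? contentType
                : contentType.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the value of the charset parameter, or null if there is none.
    static String charset(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type, the parameters follow
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return param.substring("charset=".length())
                        .replace("\"", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "text/html; charset=UTF-8";
        System.out.println(mediaType(header));
        System.out.println(charset(header));
    }
}
```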
Several websites offer services via HTTP GET calls. For example, you can send a GET request to "http://tinyurl" or "http://tr.im" and receive a short version of the URL you pass as a parameter.
The following demonstrates how to call the GET service of "http://TinyUrl" or "http://tr.im" from Java.
Create the Java project "de.vogella.web.get" and create the following classes, which call the GET service and return the result.
package de.vogella.web.get;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class TinyURL {
    private static final String tinyUrl = "http://tinyurl.com/api-create.php?url=";

    public String shorter(String url) throws IOException {
        String tinyUrlLookup = tinyUrl + url;
        // try-with-resources closes the reader after the response is read
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new URL(tinyUrlLookup).openStream()))) {
            // The first line of the response is the short URL
            return reader.readLine();
        }
    }
}
package de.vogella.web.get;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class Trim {
    private static final String trimUrl = "http://api.tr.im/v1/trim_simple?url=";

    public String shorter(String url) throws IOException {
        String trimUrlLookup = trimUrl + url;
        // try-with-resources closes the reader after the response is read
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new URL(trimUrlLookup).openStream()))) {
            // The first line of the response is the short URL
            return reader.readLine();
        }
    }
}
And a little test.
package de.vogella.web.get;

import java.io.IOException;

public class Test {
    public static void main(String[] args) throws IOException {
        String s = "http://www.vogella.de";
        TinyURL tiny = new TinyURL();
        System.out.println(tiny.shorter(s));
        Trim trim = new Trim();
        System.out.println(trim.shorter(s));
    }
}
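Both classes append the URL to the query string unchanged. This works for a simple address, but a URL that itself contains characters such as "?", "&" or spaces generally has to be percent-encoded before it is used as a parameter value. The following is a minimal sketch using java.net.URLEncoder. The class LookupUrlBuilder and the method buildLookupUrl are hypothetical names, and whether a given shortening service expects the parameter in this encoding is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the tutorial's project.
class LookupUrlBuilder {

    // Appends the target URL as an encoded query parameter value.
    static String buildLookupUrl(String serviceUrl, String target) {
        try {
            return serviceUrl + URLEncoder.encode(target, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String service = "http://tinyurl.com/api-create.php?url=";
        System.out.println(
                buildLookupUrl(service, "http://www.vogella.de/a b?x=1&y=2"));
    }
}
```

In the shorter methods above, the result of buildLookupUrl would take the place of the plain string concatenation.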
The Apache HttpClient library simplifies handling HTTP requests. Download the necessary libraries from http://hc.apache.org/
At the time of writing, HttpClient also requires additional libraries, which are described here: http://hc.apache.org/httpclient-3.x/dependencies.html
In our example we load stock data from the Yahoo! Finance service. Yahoo! provides automatically generated CSV files for all stocks on the market. To get such a file we only have to download it from this link: http://ichart.finance.yahoo.com/table.csv?s=JAVA . It returns a CSV file with the history of the stock. As you can see, this "service" takes the parameter s=JAVA, which contains the name of the stock.
Create the Java project "de.vogella.web.httpclient", download the HttpClient libraries with their dependencies from the Apache website and add them to the classpath of your Java project.
The relevant part is the method updateHistory of the GetStockHistory class. It takes the stock name as a string parameter and returns an ArrayList of StockData objects.
package de.vogella.stockticker;

import java.util.ArrayList;

public class StockHistory {
    ArrayList<StockData> m_stockHistory;

    public StockHistory() {
        m_stockHistory = new ArrayList<StockData>();
    }

    public void add(StockData data) {
        m_stockHistory.add(data);
    }
}
package de.vogella.stockticker;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Locale;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpMethod;
import org.apache.commons.httpclient.methods.GetMethod;

public class GetStockHistory {
    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
    StockHistory history = new StockHistory();

    public ArrayList<StockData> updateHistory(String company) {
        // Calendar months are zero-based, so the named constant is used.
        // Note: the date range is not passed to the service in this example.
        Calendar from = Calendar.getInstance();
        Calendar to = Calendar.getInstance();
        to.set(2009, Calendar.JANUARY, 1);
        from.set(2006, Calendar.JANUARY, 1);
        System.out.println(to.toString());
        StringBuffer url = new StringBuffer(
                "http://ichart.finance.yahoo.com/table.csv" + "?s=");
        url.append(company);
        try {
            HttpClient client = new HttpClient();
            // Delete this line if you don't use a proxy
            client.getHostConfiguration().setProxy("proxy", 8080);
            client.getHttpConnectionManager().getParams()
                    .setConnectionTimeout(5000);
            System.out.println(url.toString());
            HttpMethod method = new GetMethod(url.toString());
            method.setFollowRedirects(true);
            client.executeMethod(method);
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    method.getResponseBodyAsStream()));
            // The first line is the header, ignore it
            String inputLine = in.readLine();
            while ((inputLine = in.readLine()) != null) {
                // Skip HTML error pages and incomplete rows
                if (inputLine.startsWith("<"))
                    continue;
                String[] item = inputLine.split(",");
                if (item.length < 6)
                    continue;
                // Columns: date, open, high, low, close, volume
                StockData data = new StockData();
                data.setOpen(Double.parseDouble(item[1]));
                data.setHigh(Double.parseDouble(item[2]));
                data.setLow(Double.parseDouble(item[3]));
                data.setClose(Double.parseDouble(item[4]));
                data.setVolume(Long.parseLong(item[5]));
                data.setDate(df.parse(item[0].replace("\"", "")));
                history.add(data);
            }
            in.close();
            return history.m_stockHistory;
        } catch (Exception e) {
            System.out.println(e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        GetStockHistory gsh = new GetStockHistory();
        ArrayList<StockData> sd = gsh.updateHistory("JAVA");
        if (sd == null) {
            // updateHistory returns null if the download or parsing failed
            return;
        }
        for (int i = 0; i < sd.size(); i++) {
            System.out.println(sd.get(i).getDate());
        }
    }
}
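The listing above uses a StockData class which is not shown in this tutorial. The following is a minimal sketch of how it might look, with the fields inferred from the setters called in updateHistory. It is an assumption, not the original class. The hypothetical StockRowParser repeats the per-line parsing from the loop above so that it can be tried without a network connection. The sample row in main contains made-up values.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Assumed shape of StockData, inferred from the setters used above.
class StockData {
    private Date date;
    private double open;
    private double high;
    private double low;
    private double close;
    private long volume;

    public Date getDate() { return date; }
    public void setDate(Date date) { this.date = date; }
    public double getOpen() { return open; }
    public void setOpen(double open) { this.open = open; }
    public double getHigh() { return high; }
    public void setHigh(double high) { this.high = high; }
    public double getLow() { return low; }
    public void setLow(double low) { this.low = low; }
    public double getClose() { return close; }
    public void setClose(double close) { this.close = close; }
    public long getVolume() { return volume; }
    public void setVolume(long volume) { this.volume = volume; }
}

// Hypothetical helper which parses one CSV row into a StockData object.
class StockRowParser {

    // Returns null for rows which are not data rows (HTML, too few columns,
    // or an unparsable date), mirroring the "continue" checks in the loop.
    static StockData parse(String line) {
        if (line.startsWith("<")) {
            return null;
        }
        String[] item = line.split(",");
        if (item.length < 6) {
            return null;
        }
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        StockData data = new StockData();
        try {
            data.setDate(df.parse(item[0].replace("\"", "")));
        } catch (ParseException e) {
            return null;
        }
        data.setOpen(Double.parseDouble(item[1]));
        data.setHigh(Double.parseDouble(item[2]));
        data.setLow(Double.parseDouble(item[3]));
        data.setClose(Double.parseDouble(item[4]));
        data.setVolume(Long.parseLong(item[5]));
        return data;
    }

    public static void main(String[] args) {
        // Made-up sample row: date, open, high, low, close, volume, adj. close
        StockData data = parse("2009-01-02,4.10,4.25,4.05,4.20,1000000,4.20");
        System.out.println(data.getClose());
    }
}
```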
The proxy can be set on the command line via JVM system properties. For example, if your proxy is called "proxy" and runs on port 8080, start your program as follows.
java -Dhttp.proxyHost=proxy -Dhttp.proxyPort=8080 JavaProgram
Alternatively, the proxy can be set in code via System.setProperty. The following code sets the same proxy.
System.setProperty("http.proxySet", "true");
System.setProperty("http.proxyHost", "proxy");
System.setProperty("http.proxyPort", "8080");
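System properties apply to the whole JVM. If only a single connection should go through the proxy, java.net.Proxy can be passed to URL.openConnection instead. The following is a minimal sketch, reusing the proxy name "proxy" and port 8080 from above. The class ProxyExample and the method httpProxy are hypothetical names.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Hypothetical helper, not part of the tutorial's project.
class ProxyExample {

    // Creates an HTTP proxy description for the given host and port.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy", 8080);
        System.out.println(proxy.type());
        // Usage with the classes from the first examples (not executed here):
        // URLConnection connection =
        //         new URL("http://www.vogella.de").openConnection(proxy);
    }
}
```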
Thank you for practicing with this tutorial.
Before posting questions, please see the vogella FAQ. If you have questions or find an error in this article, please use the www.vogella.de Google Group. I have also created a short list on how to write good questions, which might help you.