Is there a way to download the exact source code of a webpage? I have tried both java.net.URL and Jsoup, but neither gives me the markup exactly as it appears in the page's actual source. For example, the page source contains this tag:
<input type="image"
name="ctl00$dtlAlbums$ctl00$imbAlbumImage"
id="ctl00_dtlAlbums_ctl00_imbAlbumImage"
title="Independence Day Celebr..."
border="0"
onmouseover="AlbumImageSlideShow('ctl00_dtlAlbums_ctl00_imbAlbumImage','ctl00_dtlAlbums_ctl00_hdThumbnails','0','Uploads/imagegallary/135/Thumbnails/IMG_3206.JPG','Uploads/imagegallary/135/Thumbnails/');"
onmouseout="AlbumImageSlideShow('ctl00_dtlAlbums_ctl00_imbAlbumImage','ctl00_dtlAlbums_ctl00_hdThumbnails','1','Uploads/imagegallary/135/Thumbnails/IMG_3206.JPG','Uploads/imagegallary/135/Thumbnails/');"
src="Uploads/imagegallary/135/Thumbnails/IMG_3206.JPG"
alt="Independence Day Celebr..."
style="height:79px;width:148px;border-width:0px;"
/>
Jsoup does not pick up the 'style' attribute of this tag. Additionally, when I download the page with the URL approach, the style attribute is replaced by border=""/> in the saved output.
I have tried the following code:
// Requires: java.io.*, java.net.URL, org.jsoup.Jsoup, org.jsoup.nodes.Document
URL url = new URL("http://www.apcob.org/");
InputStream is = url.openStream(); // throws an IOException
BufferedReader br = new BufferedReader(new InputStreamReader(is));
String line;
File fileDir = new File(contextpath + "\\extractedtxt.txt");
Writer fw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(fileDir), "UTF-8"));
while ((line = br.readLine()) != null) {
    fw.write("\n" + line);
}
// Close both streams; closing the writer flushes any buffered output to the file
fw.close();
br.close();
// Read back the same file that was just written
InputStream in = new FileInputStream(fileDir);
String baseUrl = "http://www.apcob.org/";
Document doc = Jsoup.parse(in, "UTF-8", baseUrl);
System.out.println(doc);
Another method I attempted is:
Document doc = Jsoup.connect(url_of_currentpage).get();
I am trying to achieve this in Java for the website used in the code above, which is where the issue occurs.
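To illustrate what I mean by an exact copy, here is a minimal sketch that writes the response bytes to disk without decoding them, rewriting line breaks, or passing them through an HTML parser. The class name RawPageDownloader and the method names saveExact and download are placeholders made up for this example, not part of any library. It only shows the byte-for-byte copy; the result may still differ from what a browser shows if the server returns different HTML to different clients.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, named for illustration only.
class RawPageDownloader {

    // Copies the stream to the target file byte for byte.
    // No charset decoding and no line handling, so nothing is rewritten.
    // Returns the number of bytes written.
    static long saveExact(InputStream source, Path target) throws IOException {
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Opens pageUrl and stores the response body unchanged.
    // try-with-resources closes the network stream even if the copy fails.
    static long download(String pageUrl, Path target) throws IOException {
        try (InputStream in = URI.create(pageUrl).toURL().openStream()) {
            return saveExact(in, target);
        }
    }
}
```

It would be called as RawPageDownloader.download("http://www.apcob.org/", Paths.get("extractedtxt.txt")), after which the saved file can be compared with the page source in a text editor rather than through a parser's re-serialized output.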