Issue
I've been using Jsoup to fetch certain words from a Google search, but as far as I can tell it fails somewhere in the Jsoup query.
Execution reaches the doInBackground method successfully, but the title and body of each search result are never printed.
My guess is that the list returned by doc.select (links) is empty, which points to a problem with the query syntax.
value is the search keyword; in my case it's a barcode that actually works. Here's the link
Here is the async call from another class:
    String url = "https://www.google.com/search?q=";
    if (!value.isEmpty())
    {
        url = url + value + " price" + "&num10";
        Scrape_Asynctasks task = new Scrape_Asynctasks();
        task.execute(url);
    }
And here is the async task itself:
    public class Scrape_Asynctasks extends AsyncTask<String, Integer, String>
    {
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
        }

        @Override
        protected String doInBackground(String... strings) {
            try
            {
                Log.i("IN", "ASYNC");
                final Document doc = Jsoup
                        .connect(strings[0])
                        .userAgent("Jsoup client")
                        .timeout(5000).get();
                Elements links = doc.select("li[class=g]");
                for (Element link : links)
                {
                    Elements titles = link.select("h3[class=r]");
                    String title = titles.text();
                    Elements bodies = link.select("span[class=st]");
                    String body = bodies.text();
                    Log.i("Title: ", title + "\n");
                    Log.i("Body: ", body);
                }
            }
            catch (IOException e)
            {
                Log.i("ERROR", "ASYNC");
            }
            return "finished";
        }

        @Override
        protected void onProgressUpdate(Integer... values) {
            super.onProgressUpdate(values);
        }

        @Override
        protected void onPostExecute(String s) {
            super.onPostExecute(s);
        }
    }
Solution
- Don't use "Jsoup client" as your user agent string; some sites (including Google) don't like it. Use the same string as your browser, e.g. "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0".
- Your first selector should be .g: Elements links = doc.select(".g");
- The site uses JavaScript, so you will not get all the results that you see in your browser. You can disable JS in your browser and see the difference.
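The first two points can be sketched together as follows. This is only an illustration, not a definitive implementation: the class name SelectorSketch, the helper names, and the sample HTML are hypothetical, and the sample markup is made up to show how the selectors behave; Google's real markup differs and changes over time. The attribute selector li[class=g] only matches li elements whose class attribute is exactly "g", while .g matches any element that carries the class g.

```java
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class SelectorSketch {
    // Browser-like user agent string taken from the answer above.
    static final String BROWSER_UA =
            "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0";

    // Hypothetical markup for illustration only (an assumption, not Google's real HTML).
    static final String SAMPLE_HTML =
            "<div class=\"g\"><h3>First result</h3></div>"
          + "<div class=\"g extra\"><h3>Second result</h3></div>";

    // Fetches a page while identifying as a regular browser (performs a network call).
    static Document fetch(String url) throws IOException {
        return Jsoup.connect(url)
                .userAgent(BROWSER_UA)
                .timeout(5000)
                .get();
    }

    // Counts how many elements of the sample markup a CSS query matches.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        return doc.select(cssQuery).size();
    }
}
```

With this sample markup, countMatches("li[class=g]") finds nothing because there are no li elements, whereas countMatches(".g") finds both div elements. On Android, fetch would still have to run off the main thread, as the AsyncTask in the question already does.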
Answered By - TDG
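A separate detail in the question's code, not covered by the answer above: the URL is built with a raw space in " price" and with "&num10", which lacks the "=" between parameter name and value. A minimal sketch of building the query with standard URL encoding is shown below. The class name SearchUrlSketch and the method build are hypothetical, and whether Google honors the num parameter is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlSketch {
    // Builds the search URL from the scanned keyword (hypothetical helper).
    static String build(String value) {
        try {
            // Encode the whole query so spaces and special characters are safe in a URL.
            String query = URLEncoder.encode(value + " price", "UTF-8");
            // Query parameters use name=value pairs joined by '&'.
            return "https://www.google.com/search?q=" + query + "&num=10";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM and on Android.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, SearchUrlSketch.build("12345") returns https://www.google.com/search?q=12345+price&num=10, which could then be passed to task.execute(...) in place of the hand-concatenated string.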