JSONException being thrown on Twitter search - Java

I have code that searches for old tweets on Twitter. The application ran normally for a few days, but then it suddenly started throwing an exception.

Code:

public static List<Tweet> getTweets(String username, String since, String until, String querySearch) {
    List<Tweet> results = new ArrayList<Tweet>();

    try {
        String refreshCursor = null;
        while (true) {              
            JSONObject json = new JSONObject(getURLResponse(username, since, until, querySearch, refreshCursor));
            refreshCursor = json.getString("scroll_cursor");   // <<--------
            System.out.println("while");
            Document doc = Jsoup.parse((String) json.get("items_html"));
            Elements tweets = doc.select("div.js-stream-tweet");

            if (tweets.size() == 0) {
                break;
            }

            for (Element tweet : tweets) {
                String usernameTweet = tweet.select("span.username.js-action-profile-name b").text();
                String txt = tweet.select("p.js-tweet-text").text().replaceAll("[^\u0000-\uFFFF]", "");
                int retweets = Integer.valueOf(tweet.select("span.ProfileTweet-action--retweet span.ProfileTweet-actionCount").attr("data-tweet-stat-count").replaceAll(",", ""));
                int favorites = Integer.valueOf(tweet.select("span.ProfileTweet-action--favorite span.ProfileTweet-actionCount").attr("data-tweet-stat-count").replaceAll(",", ""));
                long dateMs = Long.valueOf(tweet.select("small.time span.js-short-timestamp").attr("data-time-ms"));
                Date date = new Date(dateMs);

                Tweet t = new Tweet(usernameTweet, txt, date, retweets, favorites);
                results.add(t);
            }
        }
    } catch (Exception e) {
        System.out.println("Error!");
    }

    return results;
}

The exception is thrown on the line marked with " <<-------- ". The json object does contain the returned page content, so I do not understand what is going wrong.

This is the method that requests the page:

private static String getURLResponse(String from, String since, String until, String querySearch, String scrollCursor) throws Exception {
    String appendQuery = "";
    if (from != null) {
        appendQuery += "from:"+from;
    }
    if (since != null) {
        appendQuery += " since:"+since;
    }
    if (until != null) {
        appendQuery += " until:"+until;
    }
    if (querySearch != null) {
        appendQuery += " "+querySearch;
    }

    String url = String.format("https://twitter.com/i/search/timeline?f=realtime&q=%s&src=typd&scroll_cursor=%s", URLEncoder.encode(appendQuery, "UTF-8"), scrollCursor);

    URL obj = new URL(url);
    HttpURLConnection con = (HttpURLConnection) obj.openConnection();

    con.setRequestMethod("GET");

    BufferedReader in = new BufferedReader(
            new InputStreamReader(con.getInputStream()));
    String inputLine;
    StringBuffer response = new StringBuffer();

    while ((inputLine = in.readLine()) != null) {
        response.append(inputLine);
    }
    in.close();

    return response.toString();
}

Exception caught by the try/catch:

twitter4j.JSONException: JSONObject["scroll_cursor"] not found.
    at twitter4j.JSONObject.get(JSONObject.java:390)
    at twitter4j.JSONObject.getString(JSONObject.java:504)
    at Manager.TweetManager.getTweets(TweetManager.java:83)
    at Main.Main.main(Main.java:52)

    
asked by anonymous 05.07.2015 / 17:17

1 answer

The JSON that causes the problem looks basically like this:

{
    "has_more_items": false,
    "items_html": "<a bunch of HTML...>",
    "focused_refresh_interval": 30000
}

In other words, your code fails when the search reaches the end of the results: the last page has no item called scroll_cursor, so json.getString("scroll_cursor") throws the JSONException.

Here is what to do. Instead of this:

JSONObject json = new JSONObject(getURLResponse(username, since, until, querySearch, refreshCursor));
refreshCursor = json.getString("scroll_cursor");   // <<--------

Use this:

JSONObject json = new JSONObject(getURLResponse(username, since, until, querySearch, refreshCursor));
boolean hasMore = json.getBoolean("has_more_items"); 
refreshCursor = hasMore ? json.getString("scroll_cursor") : null;

And at the end of the while loop, add this:

if (!hasMore) break;

As Bruno César noted in a comment below:

Instead of break, you could use hasMore itself as the loop condition: initialize it to true outside the loop and use while (hasMore) to fetch the next tweets, since hasMore is updated on every iteration.
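
A minimal sketch of that while (hasMore) variant. To keep it runnable on its own, the HTTP request and the JSON parsing are replaced by stand-ins: the Page class, the collectAll method and the fake fetcher are hypothetical names, not part of your code or of twitter4j. Only the loop structure is the point here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class CursorPaging {

    /** Hypothetical stand-in for one parsed response page. */
    static final class Page {
        final List<String> items;    // stands in for the tweets parsed from "items_html"
        final boolean hasMoreItems;  // mirrors "has_more_items"
        final String scrollCursor;   // mirrors "scroll_cursor"; absent (null) on the last page

        Page(List<String> items, boolean hasMoreItems, String scrollCursor) {
            this.items = items;
            this.hasMoreItems = hasMoreItems;
            this.scrollCursor = scrollCursor;
        }
    }

    /** Collects items page by page; fetchPage receives the current cursor (null on the first call). */
    static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> results = new ArrayList<>();
        String cursor = null;
        boolean hasMore = true;               // starts as true so the first page is always fetched

        while (hasMore) {
            Page page = fetchPage.apply(cursor);
            results.addAll(page.items);

            hasMore = page.hasMoreItems;      // refreshed on every iteration
            // Only read the cursor when another page exists.
            cursor = hasMore ? page.scrollCursor : null;
        }
        return results;
    }

    public static void main(String[] args) {
        // Fake two-page source: the first page has a cursor, the last one does not.
        Function<String, Page> fakeFetch = cursor -> cursor == null
                ? new Page(Arrays.asList("tweet 1", "tweet 2"), true, "c1")
                : new Page(Arrays.asList("tweet 3"), false, null);

        System.out.println(collectAll(fakeFetch)); // prints [tweet 1, tweet 2, tweet 3]
    }
}
```

In your getTweets method, the body of the loop would still call getURLResponse, build the JSONObject and parse items_html as before. Note that the existing tweets.size() == 0 check with its break can stay as an extra safety net.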

    
05.07.2015 / 18:34