How to get Tweets using Python and Twitter API v2 – Part I

Contributor: QuantInsti

In this blog, we continue to explore the premium search features of the Twitter API using the API interface from Tweepy, check out the rate limits and how to deal with them, and get the local trends on Twitter.

We will also look at the Client interface provided by the Tweepy library for the Twitter API v2 and how to work with it to get different types of data from Twitter.

In the previous blog of this mini-series, we covered the methods available with the Twitter API v1 interface for getting different kinds of data from Twitter. We also looked at the types of access levels for the Twitter API.

Here, we will cover the following topics:

  • Premium search
    – Search last 30 days
    – Search the full archive
  • Rate limits
    – How to check the rate limit status?
    – How to set the app to wait till the rate limit is replenished?
  • Get trends for a location
  • Tweepy client for Twitter API v2
  • Client authentication
  • Get the user name for a particular user ID using client
  • Get the user ID for a particular user name using client
  • Get the user names for multiple user IDs using client
  • Get tweet(s) with tweet ID(s) using client
  • Get a user’s followers using client
  • Get users that the user follows using client
  • Get a user’s tweets using client
  • Get tweets that a user liked using client
  • Get users who retweeted a tweet using client
  • Search recent tweets using client
  • Get tweet count for a search query using client
  • Pagination in client
  • Using expansions to get user and media information
  • Writing the search results to a text file
  • Putting the search results into a dataframe
  • Twitter API v2 GitHub

Premium search

Premium search is a subscription API provided by Twitter. There are two products available with this API:

  • Search Tweets: 30-day endpoint
  • Search Tweets: Full-archive endpoint

To begin using any of the subscription APIs, you need to set up a dev environment ⁽¹⁾ for the endpoint.


Search last 30 days

Twitter provides the premium Search Tweets: 30-Day API, which gives you access to tweets posted within the last 30 days. Tweets from this window are matched against your query and returned.

This feature can be accessed using the search_30_day() method of the API class. The label argument is the name of the dev environment you set up for the endpoint.

# Define the keywords for the query
keywords = "stock market"
limit = 300

# Fetch tweets from the last 30 days that match the search criteria
tweets = tweepy.Cursor(api.search_30_day, label='test', query=keywords).items(limit)
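Note that the Cursor is lazy: no request is sent until you iterate over it. As a minimal sketch, the results could be collected into plain dictionaries as shown here. The helper name tweets_to_rows and the choice of fields are illustrative assumptions, not part of Tweepy.

```python
def tweets_to_rows(tweets):
    """Turn an iterable of Tweepy Status objects into a list of plain dicts."""
    rows = []
    for tweet in tweets:
        rows.append({
            "id": tweet.id,
            "created_at": tweet.created_at,
            "user": tweet.user.screen_name,
            "text": tweet.text,
        })
    return rows
```

Calling rows = tweets_to_rows(tweets) on the cursor above would trigger the actual requests and return at most limit rows.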

Search the full archive

We can search the full archive using the search_full_archive() method. The fromDate and toDate arguments set the start and end of the period we want to search.

# Define the keywords for the query
# Tweets about Ethereum and excluding retweets
keywords = "ethereum -RT"
limit = 300

# Fetch tweets from the archive that match the search criteria
# 'fromDate' and 'toDate' must be in the format 'yyyyMMddHHmm' (UTC):
# here we fetch tweets from midnight to noon on 20 August 2021
tweets = tweepy.Cursor(api.search_full_archive, label='test', query=keywords,
                       fromDate='202108200000', toDate='202108201200').items(limit)
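Typing the yyyyMMddHHmm strings by hand is error-prone, so it may help to build them from datetime objects. The sketch below does that with the standard library only; the helper names to_premium_timestamp and search_window are hypothetical, and the timestamps are assumed to be in UTC.

```python
from datetime import datetime, timedelta

PREMIUM_DATE_FORMAT = "%Y%m%d%H%M"  # strftime equivalent of yyyyMMddHHmm


def to_premium_timestamp(moment):
    """Format a datetime the way the premium search endpoints expect."""
    return moment.strftime(PREMIUM_DATE_FORMAT)


def search_window(start, hours):
    """Return (fromDate, toDate) strings for a window starting at `start`."""
    end = start + timedelta(hours=hours)
    return to_premium_timestamp(start), to_premium_timestamp(end)


# Midnight to noon on 20 August 2021, as in the example above
from_date, to_date = search_window(datetime(2021, 8, 20, 0, 0), hours=12)
print(from_date, to_date)  # 202108200000 202108201200
```

The two strings can then be passed as the fromDate and toDate arguments of the cursor.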

Rate limits

Twitter is an invaluable source of big data, accessed by millions of developers worldwide daily. It imposes utilization limits to make the API scalable and reliable. These limits on the usage depend on your authentication method.

There are limits on the number of requests made in a specific time interval. These are called rate limits ⁽²⁾.

So how do we deal with these limits? How do you check your current rate limit status? And if your app hits the limit, should it terminate, or should it wait for the limit to be replenished?

Let us explore both questions.

How to check the rate limit status?

# Check rate limit status
# This endpoint is itself rate limited, so avoid polling it in a tight loop
api.rate_limit_status()

The api.rate_limit_status() method returns, for each endpoint, the request limit, the number of requests remaining, and the time at which the current rate limit window resets. If you authenticate with a user's credentials, it reports the limits for that user. With app-only authentication, it reports the limits for the app.
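The returned value is a nested dictionary keyed by resource family and endpoint path. The following sketch pulls out the numbers for one endpoint. The helper name remaining_calls is hypothetical, and sample_status is made-up data shaped like the v1.1 response rather than real output.

```python
def remaining_calls(status, family, endpoint):
    """Return (remaining, limit, reset_epoch) for one endpoint."""
    entry = status["resources"][family][endpoint]
    return entry["remaining"], entry["limit"], entry["reset"]


# Made-up sample standing in for the result of api.rate_limit_status()
sample_status = {
    "resources": {
        "search": {
            "/search/tweets": {"limit": 180, "remaining": 175, "reset": 1629460800}
        }
    }
}

print(remaining_calls(sample_status, "search", "/search/tweets"))
# (175, 180, 1629460800)
```

The reset value is a Unix timestamp in seconds.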

How to set the app to wait till the rate limit is replenished?

When we initialize the API class object after authentication, we can set it up so that it waits for the rate limits to get replenished.

api = tweepy.API(auth, wait_on_rate_limit=True)

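If you would rather handle the wait yourself, the arithmetic is simple: sleep until the reset time reported for the endpoint. The sketch below is only an illustration of that idea, not Tweepy's internal implementation, and the helper name seconds_until_reset is hypothetical.

```python
import time


def seconds_until_reset(reset_epoch, now=None):
    """Seconds to wait until a rate limit window resets (never negative)."""
    if now is None:
        now = time.time()
    return max(0.0, reset_epoch - now)


# With a fixed "now", the result is easy to check by hand
print(seconds_until_reset(1000, now=400))  # 600
```

An app could pass the result to time.sleep() before retrying the request.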

Get trends for a location

Trends are an important feature of Twitter. So how can we see which locations currently have trending topics?

The available_trends() method returns the WOE (Where On Earth) id and other human-readable information for the locations for which Twitter has trending information.

api.available_trends()
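The result is a list of dictionaries, one per location, so a WOE id can also be looked up by place name. Here is a minimal sketch, assuming each entry carries the name and woeid keys. The helper name find_woeid is hypothetical and sample_locations is a hand-written stand-in for the real response.

```python
def find_woeid(locations, place_name):
    """Return the WOE id of the first location whose name matches, else None."""
    wanted = place_name.casefold()
    for location in locations:
        if location["name"].casefold() == wanted:
            return location["woeid"]
    return None


# Hand-written stand-in for the list returned by api.available_trends()
sample_locations = [
    {"name": "Worldwide", "woeid": 1, "country": ""},
    {"name": "London", "woeid": 44418, "country": "United Kingdom"},
]

print(find_woeid(sample_locations, "london"))  # 44418
```

Matching on the name alone is a simplification: several places can share a name, in which case the country field would be needed to tell them apart.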

Let us now see how we can get the trending topics on Twitter for a particular location, be it a city or a country. For this, we need the WOE id of the location. You can get this from here ⁽³⁾.

Once you have the WOE id of the place, getting the local trends is just a line of code.

woeid = 1  # WOE id 1 stands for the whole world
trends_world = api.get_place_trends(id=woeid)
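The response is a one-element list whose dictionary holds the trends under the trends key. The sketch below ranks them by tweet volume, treating a missing volume as zero. The helper name top_trends is hypothetical and sample_response is made-up data in the shape of the real response.

```python
def top_trends(place_trends, count=5):
    """Return the names of the `count` trends with the highest tweet volume."""
    trends = place_trends[0]["trends"]  # the response is a one-element list
    ranked = sorted(trends, key=lambda trend: trend["tweet_volume"] or 0, reverse=True)
    return [trend["name"] for trend in ranked[:count]]


# Made-up sample standing in for the result of api.get_place_trends()
sample_response = [{
    "trends": [
        {"name": "#ExampleA", "tweet_volume": None},
        {"name": "#ExampleB", "tweet_volume": 52000},
        {"name": "#ExampleC", "tweet_volume": 12000},
    ],
    "locations": [{"name": "Worldwide", "woeid": 1}],
}]

print(top_trends(sample_response, count=2))  # ['#ExampleB', '#ExampleC']
```

Twitter does not report a volume for every trend, which is why the sort key falls back to 0.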

Tweepy client for Twitter API v2

Tweepy provides the API interface for the Twitter API v1.1. For the v2 API, Tweepy provides the Client interface. This is available from Tweepy v4.0 onwards, so you may need to upgrade your Tweepy installation if you installed it a while back.

You can do this by simply running the following command:

pip install tweepy --upgrade

Client authentication

Authentication is similar to that of the API class, except that the client authenticates with your project's bearer token.

client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
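The snippet above assumes a bearer_token variable already exists. One common way to supply it without hard-coding the secret is an environment variable, sketched below. The variable name TWITTER_BEARER_TOKEN and the helper load_bearer_token are assumptions made for illustration.

```python
import os


def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Read the bearer token from the environment, failing loudly if it is unset."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token
```

With this in place, bearer_token = load_bearer_token() can precede the tweepy.Client(...) call.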

Get the user name for a particular user ID using client

# using get_user with id
# (named user_id so that it does not shadow Python's built-in id())
user_id = "869660137"
user = client.get_user(id=user_id)
print(f"The user name for user id {user_id} is {user.data.name}.")

Get the user ID for a particular user name using client

# using get_user with user name
username = "QuantInsti"
user = client.get_user(username=username)
print(f"The user id for user name {username} is {user.data.id}.")

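Both lookups read user.data directly, which fails if the user does not exist, because data is then None and the reason is reported under errors. A defensive sketch follows, assuming the response exposes data and errors as Tweepy's Response object does. The helper name user_field_or_errors is hypothetical.

```python
def user_field_or_errors(response, field):
    """Return (value, []) if the user was found, else (None, error details)."""
    if response.data is None:
        details = [error.get("detail", "unknown error")
                   for error in (response.errors or [])]
        return None, details
    return getattr(response.data, field), []
```

For example, name, problems = user_field_or_errors(client.get_user(username=username), "name") yields either the name or a list of messages explaining what went wrong.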

Get the user names for multiple user IDs using client

Let us now fetch the details for multiple user ids. We will fetch only some user fields ⁽⁴⁾.

# using get_users with multiple ids
ids = ["869660137", "21577803", "86813650", "44196397"]
# Fetch specific user fields
users = client.get_users(ids=ids, user_fields=[
                         'name', 'username', 'description'])

for user in users.data:
    print(
        f"The user name for user id {user.id} is {user.username} and the name is {user.name}.")
    print(f"'{user.description}'\n")
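The v2 users lookup accepts at most 100 ids per request, so a longer list has to be split into batches. A minimal sketch of such a splitter is given below; the helper name chunked is hypothetical.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 ids are split into batches of 100, 100 and 50
batch_sizes = [len(batch) for batch in chunked(list(range(250)))]
print(batch_sizes)  # [100, 100, 50]
```

Each batch can then be passed to client.get_users(ids=batch, ...) in a loop.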

Using a similar approach, you can also fetch the user ids for multiple user names by passing usernames instead of ids to get_users(). Try it out!
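One possible solution to this exercise is sketched below. It assumes client is an authenticated tweepy.Client; the helper name ids_by_username is hypothetical.

```python
def ids_by_username(client, usernames):
    """Map each user name that was found to its user id."""
    response = client.get_users(usernames=usernames)
    # response.data is None when none of the user names could be found
    return {user.username: user.id for user in (response.data or [])}
```

For instance, ids_by_username(client, ["QuantInsti"]) would return a dictionary with one entry if that account is found.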

Stay tuned for Part II to learn how to get Tweet(s) with Tweet Id(s) using client.

Visit QuantInsti for additional insights on this topic: https://blog.quantinsti.com/twitter-api-v2/.
