Date Functionality

In today's lecture, we explored date and time functionality in pandas. Pandas represents points in time, spans of time, and durations as dedicated types, which makes tasks such as resampling, date-based filtering, and other time series analysis straightforward.

Timestamp

Pandas has four main time-related classes: Timestamp, DatetimeIndex, Period, and PeriodIndex.

Creating Timestamps

A Timestamp represents a single point in time. It can be created using a string or by passing multiple parameters.

import numpy as np
import pandas as pd

# Creating a Timestamp from a string
pd.Timestamp('9/1/2019 10:05AM')

# Creating a Timestamp by passing multiple parameters (year, month, day, hour, minute)
pd.Timestamp(2019, 12, 20, 0, 0)
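
As a small check of the two construction styles described above, the sketch below builds the same instant both ways and compares the results. The variable names are illustrative and not part of the lecture.

```python
import pandas as pd

# Two ways of describing the same instant: a string and explicit components.
from_string = pd.Timestamp('2019-12-20 10:05')
from_parts = pd.Timestamp(2019, 12, 20, 10, 5)  # year, month, day, hour, minute

print(from_string == from_parts)
print(from_parts.isoformat())
```

Both forms produce an equal Timestamp, so the choice is mostly a matter of which is more convenient for the data at hand.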

Timestamp Attributes

Timestamps expose useful attributes (such as second) and methods (such as isoweekday()).

# Getting the weekday of a Timestamp (1=Monday, 7=Sunday)
pd.Timestamp(2019, 12, 20, 0, 0).isoweekday()

# Extracting the second from a Timestamp
pd.Timestamp(2019, 12, 20, 5, 2, 23).second
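
The following sketch collects a few more accessors on the same Timestamp. Note that dayofweek and isoweekday() use different numbering conventions, which is easy to trip over; the variable name ts is illustrative.

```python
import pandas as pd

ts = pd.Timestamp(2019, 12, 20, 5, 2, 23)  # 20 December 2019 was a Friday

print(ts.year, ts.month, ts.day)  # calendar components
print(ts.dayofweek)               # Monday=0 ... Sunday=6
print(ts.isoweekday())            # Monday=1 ... Sunday=7
print(ts.day_name())              # weekday as a string
```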

Period

The Period class represents a span of time rather than a specific point in time.

Creating Periods

# Creating a Period representing January 2016
pd.Period('1/2016')

# Creating a Period representing March 5, 2016
pd.Period('3/5/2016')

Arithmetic with Periods

Adding or subtracting an integer shifts a Period by that many units of its own frequency: months for a monthly Period, days for a daily one.

# Adding 5 months to January 2016
pd.Period('1/2016') + 5

# Subtracting 2 days from March 5, 2016
pd.Period('3/5/2016') - 2
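
To make the unit of the arithmetic explicit, the sketch below passes freq directly instead of relying on inference from the string. It also shows start_time and end_time, which illustrate that a Period covers a span rather than a single instant. The variable names are illustrative.

```python
import pandas as pd

month = pd.Period('2016-01', freq='M')   # a monthly span
day = pd.Period('2016-03-05', freq='D')  # a daily span

later_month = month + 5  # shifts by 5 months
earlier_day = day - 2    # shifts by 2 days
print(later_month, earlier_day)

# A Period has a beginning and an end, unlike a Timestamp.
print(month.start_time, month.end_time)
```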

DatetimeIndex and PeriodIndex

Creating DatetimeIndex

A DatetimeIndex is the index of a series of Timestamps.

t1 = pd.Series(list('abc'), [pd.Timestamp('2016-09-01'), pd.Timestamp('2016-09-02'), pd.Timestamp('2016-09-03')])
print(t1)
print(type(t1.index)) # DatetimeIndex

Creating PeriodIndex

A PeriodIndex is the index of a series of Periods.

t2 = pd.Series(list('def'), [pd.Period('2016-09'), pd.Period('2016-10'), pd.Period('2016-11')])
print(t2)
print(type(t2.index)) # PeriodIndex
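
The two index types are related: a DatetimeIndex can be converted to a PeriodIndex with to_period. The sketch below is an illustration that goes slightly beyond the lecture, using made-up dates.

```python
import pandas as pd

stamps = pd.to_datetime(['2016-09-01', '2016-09-02', '2016-09-03'])
print(type(stamps))   # DatetimeIndex

# Each timestamp is mapped to the month that contains it.
periods = stamps.to_period('M')
print(type(periods))  # PeriodIndex
print(periods)
```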

Converting to Datetime

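pd.to_datetime converts date strings, even when they are written in different formats, to Timestamps.

d1 = ['2 June 2013', 'Aug 29, 2014', '2015-06-26', '7/12/16']
ts3 = pd.DataFrame(np.random.randint(10, 100, (4, 2)), index=d1, columns=list('ab'))
# The strings use different formats; pandas 2.0+ needs format='mixed' to parse each one individually
ts3.index = pd.to_datetime(ts3.index, format='mixed')
print(ts3)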

# Parsing dates in European format
pd.to_datetime('4.7.12', dayfirst=True)
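
The effect of dayfirst is easiest to see on an ambiguous string. The sketch below parses the same text twice; the input '04/07/2012' is an illustrative value, not one from the lecture.

```python
import pandas as pd

ambiguous = '04/07/2012'

month_first = pd.to_datetime(ambiguous)                # default: month/day/year
day_first = pd.to_datetime(ambiguous, dayfirst=True)   # European: day/month/year

print(month_first.date(), day_first.date())
```

The default reading gives 7 April 2012, while dayfirst=True gives 4 July 2012, so it is worth being explicit whenever the data source is known to use day-first dates.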

Timedelta

A Timedelta represents a difference between two dates or times.

# Calculating the difference between two dates
pd.Timestamp('9/3/2016') - pd.Timestamp('9/1/2016')

# Adding a Timedelta to a Timestamp
pd.Timestamp('9/2/2016 8:10AM') + pd.Timedelta('12D 3h')
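
The sketch below repeats both operations with keyword arguments, which avoids relying on unit abbreviations in strings, and inspects the result. The variable names are illustrative.

```python
import pandas as pd

start = pd.Timestamp('2016-09-01')
end = pd.Timestamp('2016-09-03')

gap = end - start  # subtracting Timestamps yields a Timedelta
print(gap, gap.days)

# Timedelta also accepts keyword arguments instead of a string.
shifted = pd.Timestamp('2016-09-02 08:10') + pd.Timedelta(days=12, hours=3)
print(shifted)
```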

Offset

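An Offset represents a calendar-aware duration, such as a week or the step to the next month end, whose length in days can vary with the date it is applied to.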

# Adding a week to a Timestamp
pd.Timestamp('9/4/2016') + pd.offsets.Week()

# Rolling a Timestamp forward to the end of its month
pd.Timestamp('9/4/2016') + pd.offsets.MonthEnd()
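
The sketch below shows what the two offsets produce, including the case where the date already falls on a month end, in which MonthEnd moves on to the following month's end. The variable names are illustrative.

```python
import pandas as pd

ts = pd.Timestamp('2016-09-04')

plus_week = ts + pd.offsets.Week()       # exactly seven days later
month_end = ts + pd.offsets.MonthEnd()   # last day of September

# Starting on a month end moves to the NEXT month end.
next_month_end = pd.Timestamp('2016-09-30') + pd.offsets.MonthEnd()

print(plus_week.date(), month_end.date(), next_month_end.date())
```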

Working with Dates in a DataFrame

Creating a DatetimeIndex with date_range

Using date_range, you can create a DatetimeIndex with specified start or end dates, number of periods, and frequency.

dates = pd.date_range('10-01-2016', periods=9, freq='2W-SUN')
print(dates)

# Creating a DataFrame with the DatetimeIndex
df = pd.DataFrame({'Count 1': 100 + np.random.randint(-5, 10, 9).cumsum(),
'Count 2': 120 + np.random.randint(-5, 10, 9)}, index=dates)
print(df)
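
To see how the frequency string '2W-SUN' is interpreted, the sketch below rebuilds the same index and checks its properties. The start date is a Saturday, so the range begins on the next Sunday and then steps two weeks at a time.

```python
import pandas as pd

dates = pd.date_range('2016-10-01', periods=9, freq='2W-SUN')

print(dates[0].date(), dates[-1].date())  # first and last generated dates
print((dates.dayofweek == 6).all())       # every date is a Sunday (Monday=0)
print(dates[1] - dates[0])                # spacing between consecutive dates
```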

Checking Day of the Week

# Returns the weekday number for each date (Monday=0, Sunday=6)
df.index.weekday

Calculating Differences

# Change in each column from one date to the next (the first row is NaN)
df.diff()
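
Because the lecture's DataFrame holds random numbers, the sketch below uses a small hand-written frame (hypothetical values) so the result of diff() is easy to follow.

```python
import pandas as pd

idx = pd.date_range('2016-10-02', periods=4, freq='2W-SUN')
counts = pd.DataFrame({'Count 1': [100, 104, 103, 110]}, index=idx)

# Each row minus the previous row; the first row has no predecessor.
changes = counts.diff()
print(changes)
```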

Resampling Data

Resampling allows aggregation of data into different frequencies.

# 'ME' is the month-end alias in pandas 2.2+; older versions use 'M'
df.resample('ME').mean()
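
The sketch below resamples a small hand-written frame (hypothetical values) to monthly means. It passes a pd.offsets.MonthEnd() object rather than a string alias, on the assumption that this sidesteps differences in alias spelling between pandas versions.

```python
import pandas as pd

# Biweekly Sundays: 10-02, 10-16, 10-30, 11-13, 11-27
idx = pd.date_range('2016-10-02', periods=5, freq='2W-SUN')
counts = pd.DataFrame({'Count 1': [100, 110, 120, 90, 110]}, index=idx)

# Group the rows by calendar month and average each group.
monthly = counts.resample(pd.offsets.MonthEnd()).mean()
print(monthly)
```

The three October rows collapse into one row labelled with the October month end, and the two November rows into another.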

Datetime Indexing and Slicing

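You can use partial string indexing with .loc to select rows by date.

# Filtering by year
df.loc['2017']

# Filtering by month
df.loc['2016-12']

# Filtering by a range of dates (December 2016 onward)
df.loc['2016-12':]

# Filtering by year again: all rows from 2016
df.loc['2016']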
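
The sketch below runs these selections on a deterministic frame so the row counts can be checked. It uses .loc because pandas 2.0 and later treat a plain string inside df[...] as a column label, not a row selector. The column values are placeholders.

```python
import pandas as pd

idx = pd.date_range('2016-10-02', periods=9, freq='2W-SUN')
df = pd.DataFrame({'Count 1': range(9)}, index=idx)

in_2017 = df.loc['2017']       # all rows in 2017
in_dec = df.loc['2016-12']     # all rows in December 2016
from_dec = df.loc['2016-12':]  # December 2016 through the end

print(len(in_2017), len(in_dec), len(from_dec))
```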