a-pandas-ex-string-to-dtypes 0.1

Creator: bradpython12

Last updated:

Add to Cart

Description:

apandasexstringtodtypes 0.1

What is it used for?
Convert a Pandas DataFrame/Series with dtype str/string/object to the best available dtypes
Installation
pip install a-pandas-ex-string-to-dtypes

Usage
from a_pandas_ex_string_to_dtypes import pd_add_string_to_dtypes

import pandas as pd

pd_add_string_to_dtypes()

df = pd.read_csv("https://github.com/pandas-dev/pandas/raw/main/doc/data/titanic.csv")

print(df)

print(df.dtypes)





PassengerId Survived Pclass ... Fare Cabin Embarked

0 1 0 3 ... 7.2500 NaN S

1 2 1 1 ... 71.2833 C85 C

2 3 1 3 ... 7.9250 NaN S

3 4 1 1 ... 53.1000 C123 S

4 5 0 3 ... 8.0500 NaN S

.. ... ... ... ... ... ... ...

886 887 0 2 ... 13.0000 NaN S

887 888 1 1 ... 30.0000 B42 S

888 889 0 3 ... 23.4500 NaN S

889 890 1 1 ... 30.0000 C148 C

890 891 0 3 ... 7.7500 NaN Q

[891 rows x 12 columns]



PassengerId int64

Survived int64

Pclass int64

Name object

Sex object

Age float64

SibSp int64

Parch int64

Ticket object

Fare float64

Cabin object

Embarked object

dtype: object











dfstring = pd.concat(

[df[x].astype("string") for x in df.columns], axis=1, ignore_index=True

)

dfstring.columns=df.columns

print(dfstring)

print(dfstring.dtypes)



PassengerId Survived Pclass ... Fare Cabin Embarked

0 1 0 3 ... 7.25 <NA> S

1 2 1 1 ... 71.2833 C85 C

2 3 1 3 ... 7.925 <NA> S

3 4 1 1 ... 53.1 C123 S

4 5 0 3 ... 8.05 <NA> S

.. ... ... ... ... ... ... ...

886 887 0 2 ... 13.0 <NA> S

887 888 1 1 ... 30.0 B42 S

888 889 0 3 ... 23.45 <NA> S

889 890 1 1 ... 30.0 C148 C

890 891 0 3 ... 7.75 <NA> Q

[891 rows x 12 columns]





PassengerId string

Survived string

Pclass string

Name string

Sex string

Age string

SibSp string

Parch string

Ticket string

Fare string

Cabin string

Embarked string

dtype: object







converted = dfstring.ds_string_to_best_dtype()

print(converted)

print(converted.dtypes)

PassengerId Survived Pclass ... Fare Cabin Embarked

0 1 0 3 ... 7.2500 <NA> S

1 2 1 1 ... 71.2833 C85 C

2 3 1 3 ... 7.9250 <NA> S

3 4 1 1 ... 53.1000 C123 S

4 5 0 3 ... 8.0500 <NA> S

.. ... ... ... ... ... ... ...

886 887 0 2 ... 13.0000 <NA> S

887 888 1 1 ... 30.0000 B42 S

888 889 0 3 ... 23.4500 <NA> S

889 890 1 1 ... 30.0000 C148 C

890 891 0 3 ... 7.7500 <NA> Q

[891 rows x 12 columns]





PassengerId uint16

Survived uint8

Pclass uint8

Name string

Sex category

Age object

SibSp uint8

Parch uint8

Ticket object

Fare float64

Cabin category

Embarked category

dtype: object





Parameters:

df: Union[pd.DataFrame, pd.Series]

Returns:

Union[pd.DataFrame, pd.Series]

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

Customer Reviews

There are no reviews.