Hugging Face Datasets uses Apache Arrow to store and manipulate data, and the library provides an add_column method for adding a column to a dataset.

This feature was released in the latest version of the library as of this post's publication date.

Add New Column

The code is straightforward, with a few minor points to note.

When you load a dataset, you get a DatasetDict, which is a dictionary of datasets keyed by split name, so you have to choose a key before you can add a column. For example, data loaded from a single CSV file is available under the train key.

Happy coding!


Photo by Markus Spiske on Unsplash

Hugging Face Datasets provides a clean interface for loading, processing, and saving many types of ML datasets.

Install

You can use the following pip command to install the Datasets library.

pip install datasets

Load Data

To load CSV data, use the load_dataset function with the csv loader.

Save/Convert to_csv

To save or convert a dataset to CSV, you can use the to_csv method.
