Hugging Face Datasets uses Apache Arrow to manipulate data, and the library provides a method for adding a column to a dataset.

This was released in the latest version of the library as of this post's publication date.

Add New Column

The code for this is straightforward, with a few minor observations.

When you load a dataset you get a DatasetDict, a dictionary of datasets keyed by split name, so you have to pick a key before adding the column. In this example the data sits under the train key.

Happy coding !!!




Hugging Face Datasets provides a clean interface to load, process, and save different types of ML datasets.


You can use the following pip command to install the Datasets library.

pip install datasets

Load Data

To load CSV data, use the load_dataset function.

Save/Convert to_csv

To save or convert a dataset to CSV, use the to_csv method.




System.Text.Json is one of the most used JSON libraries in dotnet. By default it writes property names exactly as declared, which in C# usually means Pascal casing, and it offers camel casing as a built-in naming policy that you can set in the serializer options.

Let's start on our custom naming policy.

Domain Logic

First comes the domain logic: the algorithm that converts a string to snake case.

I took this from an Entity Framework Core project, which supports several naming policies.

The next step is to subclass JsonNamingPolicy and override its ConvertName method.

Now that the snake case policy is available, it's time to set it as the PropertyNamingPolicy on JsonSerializerOptions.

For performance, a JsonSerializerOptions instance should be created once and reused rather than created repeatedly, and the same applies to the naming policy. That is why we have created a static instance of the naming policy class.
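The conversion itself does not depend on .NET, so here is a minimal Python sketch of one common regex-based approach. It is an assumption for illustration, not the Entity Framework Core implementation, and it may differ from that one in edge cases; the function name to_snake_case is made up.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a PascalCase or camelCase identifier to snake_case."""
    # Put "_" between a lowercase letter or digit and the uppercase letter after it.
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from a following capitalized word ("HTTPServer").
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    return s.lower()


for sample in ["FirstName", "userID", "HTTPServer", "Id"]:
    print(sample, "->", to_snake_case(sample))
```

The same two rules port directly to a C# helper that ConvertName can call.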



TL;DR: if you are trying to find out how to inject IConfiguration, you don't need to, as it's already available on the builder.

If you are using dotnet 5 or an earlier version, you need to inject the configuration object into the Startup class and use it from there.

With .NET 6 it has become quite easy.

Simple, good-looking code in dotnet 6.