satish1v

Hugging Face Datasets uses Apache Arrow to store and manipulate data, and the library provides a method for adding a column to a dataset.

This was released in the latest version of the library as of this post's publishing date.

Add New Column

The code for this is straightforward, with a few minor observations.

When you load a dataset you get a DatasetDict, which is a dictionary of datasets, so you have to pick one by its key before adding the column. In this example the key is train.

Happy coding!!!

--


Photo by Markus Spiske on Unsplash

Hugging Face Datasets provides a clean interface for loading, processing, and saving different types of ML datasets.

Install

You can use the following pip command to install the Datasets library.

pip install datasets

Load Data

To load CSV data, use the load_dataset function with the csv loader.

Save/Convert to_csv

To save or convert a dataset to CSV, you can use the dataset's to_csv method.

--


Photo by K. Mitch Hodge on Unsplash

System.Text.Json is one of the most widely used JSON libraries in .NET. By default it serializes property names exactly as they are declared (Pascal case by C# convention), and it offers camel casing as a built-in option through JsonNamingPolicy.CamelCase.

Let's start on our custom naming policy.

Domain Logic

First comes the domain logic: the algorithm that converts a string to snake case.

I took this from an Entity Framework Core project that supports several naming policies.

The next step is to subclass JsonNamingPolicy and override its ConvertName method.

Now that the snake case policy is available, it's time to set it as the PropertyNamingPolicy on JsonSerializerOptions.

For performance, a JsonSerializerOptions instance should be created once and reused rather than rebuilt for every call, and the same applies to the naming policy. That is why we created a static instance of the naming policy class.
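To make the conversion logic concrete, here is a language-neutral sketch written in Python. It is not the Entity Framework Core code the post refers to and not the C# used in the post; it is a simple regex-based approach, and the function name to_snake_case is an assumption for illustration.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a PascalCase or camelCase identifier to snake_case."""
    # Put an underscore between a lowercase letter or digit and the
    # uppercase letter that follows it: "FirstName" -> "First_Name".
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Split a run of capitals from the word that follows it:
    # "HTTPServer" -> "HTTP_Server".
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    return s.lower()


for sample in ["FirstName", "userId", "HTTPServer", "OrderID"]:
    print(sample, "->", to_snake_case(sample))
```

In the C# version, logic of this kind would live inside the method that the custom naming policy calls for each property name.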

--


TL;DR: if you are trying to find out how to inject IConfiguration, you don't need to. It is already available on the builder as builder.Configuration.

If you are using .NET 5 or earlier, you need to inject the configuration object into the Startup class and use it from there.

With .NET 6, it has become quite easy.

Simple, good-looking code in .NET 6.

--


Photo by Clay Banks on Unsplash

In the side hustle I am working on, I use the AWS CloudFormation SDK with Serverless Stack to create DynamoDB tables.

On the C# side, I use the DynamoDB object persistence model to interact with the database. The object persistence model uses a C# attribute, DynamoDBTable, to specify the table name.

For Example

Serverless Stack (or CloudFormation, in this case) generates the table name dynamically based on the environment. In dev, for example, the table is named dev-todo-storage.

Because it is a .NET attribute, we can't change its value at runtime, and the attribute is mandatory for the object persistence model to work.

Finding a Workaround

After some digging through the DynamoDB SDK code, I came across the DynamoDBOperationConfig class. Its OverrideTableName property lets you override the table name at runtime.

A simple solution, but not an easy one to find by searching.

--
