Hi, I'm trying to build a calculation REST API using Django.

My objective is to save the preprocessed data (the result) into a field of the DB table.

The process is: read the raw data path (a txt file) from the DB, run a calculation with pandas and NumPy, then create a result txt file and save it to the preprocessed_data field in the DB.

This is my models.py

class PreprocessData(models.Model):

    raw = models.FileField(
        upload_to=_user_directory_path,
        storage=OverwriteStorage(),
    )

    preprocessed_data = models.CharField(
        max_length=200,
        null=True,
        blank=True,
    )

and views.py

class PreprocessCalculate(APIView):
    def _get_object(self, pk):
        try:
            data = PreprocessData.objects.get(id=pk)
            return data
        except PreprocessData.DoesNotExist:
            raise NotFound

    # GET, PUT or POST: which is the best design for this API?

    # PUT API
    def put(self, request, pk):
        data = self._get_object(pk)
        serializer = PreprocessCalculateSerializer(data, request.data, partial=True)
        if serializer.is_valid():
            updated_serializer = serializer.save()
            return Response(PreprocessDataSerializer(updated_serializer).data)
        else:
            return Response(serializer.errors, status=HTTP_400_BAD_REQUEST)

and serializers.py

class PreprocessResultField(serializers.CharField):
    def to_representation(self, value) -> str:
        ret = {"result": value.test_func()}
        return ret

    def to_internal_value(self, data):
        return data["result"]


class PreprocessCalculateSerializer(serializers.ModelSerializer):
    preprocessed_data = PreprocessResultField()

    class Meta:
        model = PreprocessData
        fields = ("uuid", "preprocessed_data")

My questions are:

  1. With the code above, "preprocessed_data" in the DB is still null. What is wrong with my custom serializer field?

  2. I chose the PUT method to run the calculation on the raw file, but I think PUT could change my "uuid" by mistake, which would be bad. Should I use GET or POST instead for a calculation REST API? If PUT is right, how do I keep the operation idempotent?

Please help me...

jihyeon

1 Answer

In practice you can use any method you want (although there are conventions about which method fits which operation). More importantly, you need to know what kind of file you are going to process. Here is a simple example that only accepts .csv uploads, reads the data, and runs a different calculation depending on request.method:

models.py

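import uuid

from django.contrib.auth.models import User  # assumes the default Django user model
from django.db import models


def user_directory_path(instance, filename):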
    # file will be uploaded to MEDIA_ROOT/user_<id>/<filename>
    return "user_{0}/{1}".format(instance.user.id, filename)


class PreprocessData(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    raw = models.FileField(upload_to=user_directory_path)
    result = models.CharField(max_length=255)

views.py (passes extra context to the serializer)

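from rest_framework import views
from rest_framework.response import Response

from .serializers import PreprocessDataSerializer


class PreprocessCalculate(views.APIView):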
    def post(self, request, format=None):
        serializer = PreprocessDataSerializer(
            data=request.data, context={"method": request.method}
        )
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(serializer.data)

    def put(self, request, format=None):
        serializer = PreprocessDataSerializer(
            data=request.data, context={"method": request.method}
        )
        serializer.is_valid(raise_exception=True)
        serializer.save()
        return Response(serializer.data)

processors.py

import pandas as pd


def post_pre_process(file):
    # Sum every value in the headerless CSV.
    data = pd.read_csv(file, header=None)
    return data.to_numpy().sum()


def put_pre_process(file):
    # Multiply every value in the headerless CSV.
    data = pd.read_csv(file, header=None)
    return data.to_numpy().prod()

serializers.py

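from rest_framework import serializers

from .models import PreprocessData
from .processors import post_pre_process, put_pre_process


class PreprocessDataSerializer(serializers.ModelSerializer):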

    class Meta:
        model = PreprocessData
        fields = ['user', 'raw', 'result']
        extra_kwargs = {
            'result': {'read_only': True}
        }
    
    def validate(self, attrs):
        """Only .csv files are accepted"""
        # Check the suffix instead of unpacking split("."), which raises
        # ValueError for names with extra dots or no dot at all.
        if not attrs["raw"].name.lower().endswith(".csv"):
            raise serializers.ValidationError({"msg": "File must be .csv"})

        return super().validate(attrs)
    
    def create(self, validated_data):
        method = self.context.get('method')
        if method == 'POST':
            validated_data['result'] = post_pre_process(validated_data['raw'])
        if method == 'PUT':
            validated_data['result'] = put_pre_process(validated_data['raw'])
        return super().create(validated_data)
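
Uploaded file names may contain several dots or none at all, so the extension is safest taken from the last dot only. Below is a minimal standalone sketch of such a check that can be tried outside Django; the helper name has_csv_extension is hypothetical and not part of the code above.

```python
import os


def has_csv_extension(filename: str) -> bool:
    """Return True when the part after the last dot is 'csv' (case-insensitive)."""
    # os.path.splitext splits on the last dot only, so extra dots are harmless
    # and a name without any dot simply yields an empty extension.
    _, ext = os.path.splitext(filename)
    return ext.lower() == ".csv"


print(has_csv_extension("myfile.csv"))        # True
print(has_csv_extension("my.data.file.CSV"))  # True
print(has_csv_extension("README"))            # False
```

Inside a serializer's validate, the same test could be applied to attrs["raw"].name before raising serializers.ValidationError.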

About the other questions:

With the code above, "preprocessed_data" in the DB is still null. What is wrong with my custom serializer field?

In your code, to_representation returns a dict, not a string, and it calls value.test_func(), which is not defined anywhere in the code you posted. If test_func() returns a string, then your to_representation should return that string directly, i.e. ret['result'].

I chose the PUT method to run the calculation on the raw file, but I think PUT could change my "uuid" by mistake, which would be bad. Should I use GET or POST instead for a calculation REST API? If PUT is right, how do I keep the operation idempotent?

As I said, both POST and PUT are fine. Also, the uuid will not change if the field is declared with editable=False. Lastly, the result is deterministic: for a given method, the same file always produces the same result. For instance, suppose a .csv file contains:

1,2,3,4,5

Then POST will always return

{
    "user": 1,
    "raw": "/media/user_1/myfile_ciacf5j.csv",
    "result": "15"
}

In the same way, PUT will always return

{
    "user": 1,
    "raw": "/media/user_1/myfile_bljuHte.csv",
    "result": "120"
}
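
The two results above can be checked by hand without Django or pandas. The sketch below swaps in the standard library (csv and math.prod, Python 3.8+) for pandas and NumPy; the function names sum_of_cells and product_of_cells are hypothetical stand-ins for the POST and PUT processors, and it assumes every cell holds an integer.

```python
import csv
import io
import math


def read_numbers(file):
    """Flatten every non-empty cell of a headerless CSV into a list of ints."""
    return [int(cell) for row in csv.reader(file) for cell in row if cell.strip()]


def sum_of_cells(file):
    # Stand-in for the POST calculation: add up all values.
    return sum(read_numbers(file))


def product_of_cells(file):
    # Stand-in for the PUT calculation: multiply all values.
    return math.prod(read_numbers(file))


sample = "1,2,3,4,5\n"
print(sum_of_cells(io.StringIO(sample)))      # 15
print(product_of_cells(io.StringIO(sample)))  # 120
```

Calling either function again on the same content gives the same number, which is the sense in which the result is always the same for a given method.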
Niko