I'm trying to create a "toy" database engine and I'm stuck at the file writing. This is my code so far:

dbengine.hpp:

#pragma once

#include <iostream>
#include <fstream>
#include <exception>
#include <string>
#include <cstdlib>
#include <stdio.h>
#include <string.h>
#include <errno.h>

template<class T>
class PointerHandler{
    public:
        class ChainLink{
            public:
                    /**
                     * @brief Construct a new ChainLink object.
                     * 
                     * @param cellCount Count of the cells that will be hold by the block.
                     */
                    ChainLink(long cellCount) : _cellCount(cellCount){
                        _block = (T*) malloc(sizeof(T)*cellCount);
                    };
                    virtual ~ChainLink(){
                        free(_block);
                    };
                    /**
                     * @brief Get the value of a cell in the block.
                     * 
                     * @param cellNumber The number of the cell we want to read out.
                     * @return T Type of the cell value.
                     */
                    T getCellValue(long cellNumber){
                        try{
                            return _block[cellNumber];
                        }
                        catch( std::exception& e){
                            std::cout << "Problem with getting value: " << e.what() << std::endl;
                            return T();
                        };
                    };
                    /**
                     * @brief Set the value of the given cell.
                     * 
                     * @param cellNumber The number of the cell which value we want to set.
                     * @param value The value what we want to set to the cell.
                     */
                    void setCellValue(long cellNumber, T value){
                        if (cellNumber < 0 || cellNumber >= _cellCount){
                            std::cout << "Problem with setting value: not valid cellNumber" << std::endl;
                            return;
                        }
                        _block[cellNumber] = value;
                    };
                    /**
                     * @brief Get how many cells we have in the block.
                     * 
                     * @return long The cell count in the block.
                     */
                    long getCellCount(){
                        return _cellCount;
                    };
                    /**
                     * @brief Set the blockID.
                     * 
                     * @param ID The ID of the block.
                     */
                    void setBlockID(long ID){
                        _blockID = ID;
                    };
                    /**
                     * @brief Get the ID of the block.
                     * 
                     * @return long The ID of the block.
                     */
                    long getBlockID(){
                        return _blockID; 
                    };
                    /***-----------------------------------------------***/
                    T *getBlock(){
                        return _block;
                    };
                    /***-----------------------------------------------***/
                    /**
                     * @brief Pointer to the next chainlink in the chain.
                     * 
                     */
                    ChainLink *next;
                    /**
                     * @brief Pointer to the previous chainlink in the chain.
                     * 
                     */
                    ChainLink *prev;
            private:
                    /**
                     * @brief Cellcount in the block.
                     * 
                     */
                    long _cellCount;
                    /**
                     * @brief The ID of the block of cells. Basically the first cell number in the block.
                     * 
                     */
                    long _blockID;
                    /**
                     * @brief The pointer of the block.
                     * 
                     */
                    T *_block;     
        };
        /**
         * @brief Construct a new PointerHandler object.
         * 
         * @param chainLinkCount Maximum count of the chainlinks (blocks).
         * @param cellCount Maximum count of the cells in a block.
         */
        PointerHandler(std::string vmFileName, long chainLinkCount = 1024, long cellCount = 1024) : _chainLinkCount(chainLinkCount), _cellCount(cellCount), _vmFileName(vmFileName){
            
            _first = new ChainLink(cellCount); // Creating the first chainlink
            _first->prev = nullptr; // Its prev pointer must be nullptr
            ChainLink *current = _first; // Create a temporary pointer to help the other steps
            int i = 1; // Counter for the process

            while(i<chainLinkCount){ // Loop until all chainlinks are created
                ChainLink *temp = new ChainLink(cellCount); // Create new chainlink
                temp->setBlockID(i*cellCount); // Set the chainlinks blockID
                temp->prev = current; // Set the previous pointer to the currently selected chainlink

                current->next = temp; // Set the currently selected chainlink's next pointer to the created chainlink
                current = current->next; // Select the created chainlink

                i++; // Add one to the counter
            };

            current->next = nullptr; // When we reach the last chainlink, we set its next pointer to nullptr
            _last = current; // And store the most recently created chainlink as the last one

            _file.open (_vmFileName, std::ios::in | std::ios::out | std::ios::binary); // Opening buffer file
            if(!_file.is_open()){ // If the file does not exist
                // DEBUG THINGS START
                std::cerr << "File does not exist: " << _vmFileName << std::endl;
                // DEBUG THINGS END
                _file.open (_vmFileName, std::ios::in | std::ios::out | std::ios::binary | std::fstream::trunc); // We create a new one
            }
            if (!_file.is_open() || _file.fail()) // If we can't read the file
            {
                std::cerr << "Can't read file..." << std::endl;
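                throw std::ios_base::failure("Can't open buffer file"); // A bare `throw;` with no active exception calls std::terminate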
            }

            initBlocks(); // Then initialize the blocks

        };
        /**
         * @brief Destroy the PointerHandler object.
         * 
         */
        ~PointerHandler(){

            saveBlocks(); // First we save the information held in the buffer

            ChainLink *current = _first; // Set up a current pointer

            while(current->next!=nullptr){ // Exit when the next one is nullptr
                current = current->next; // Step to the next
                delete current->prev; // Delete the previous
            };
            delete _last; // Delete the last one as well

            _file.close(); // Close the buffer file
        };
        /**
         * @brief Get the value of the given cell.
         * 
         * @param cellNumber The number of the cell we want to get.
         * @return T The value we get.
         */
        T getCell(long cellNumber){
            ChainLink *current = _first; // Set up a current pointer

            while(current!=nullptr){ // Exit when the current one is nullptr
                if(current->getBlockID()==(cellNumber/_cellCount)*_cellCount){ // If the chainlink's blockID is the same as we are searching for
                    return current->getCellValue(cellNumber%_cellCount); // Then we return with the value of the cell we looking for
                };
                current = current->next; // If we don't find it here, we go to the next chainlink
            };

            // NEED TO IMPLEMENT THE FILE SEARCH HERE!!!

            std::cout << "Can't find cellNumber: " << cellNumber << std::endl;
            return T(); // If we can't find it all, we go back with an empty value
        };
        /**
         * @brief Set the value of the given cell.
         * 
         * @param cellNumber The number of the cell we want to set.
         * @param value The value, we want to set to the cell.
         */
        void setCell(long cellNumber, T value){
           ChainLink *current = _first; // Set up a current pointer

            while(current!=nullptr){ // Exit when the current one is nullptr
                if(current->getBlockID()==(cellNumber/_cellCount)*_cellCount){ // If the chainlink's blockID is the same as we are searching for
                    current->setCellValue(cellNumber%_cellCount, value); // Then we set the value of the cell we looking for
                    return; // And the return
                };
                current = current->next; // If we don't find it here, we go to the next chainlink
            };

            // NEED TO IMPLEMENT THE FILE SEARCH HERE!!
            
            std::cout << "Can't find cellNumber: " << cellNumber << std::endl;
            return; // If we can't find it all, we simply go back
        };

        void initBlocks(){
            ChainLink *X = _loadBlock(0);
        };

        void saveBlocks(){
            ChainLink *current = _first;
            do{
                _saveBlock(current);
                current = current->next;
            }while(current!=nullptr);
        };

    private:
        /**
         * @brief Counter for the number of the chainlinks.
         * 
         */
        long _chainLinkCount;
        /**
         * @brief Counter for the number of the cell in one block.
         * 
         */
        long _cellCount;
        /**
         * @brief The first chainlink pointer.
         * 
         */
        ChainLink *_first;
        /**
         * @brief The last chainlink pointer.
         * 
         */
        ChainLink *_last;
        /**
         * @brief Holds the path of the buffer file.
         * 
         */
        std::string _vmFileName;
        /**
         * @brief File stream for buffer file. Opening at the construction closing at destruction.
         * 
         */
        std::fstream _file;

        ChainLink *_loadBlock(long cellNumber){
            _file.seekp(((cellNumber/_cellCount)*_cellCount)*sizeof(T), std::ios::beg); // Let's find out, where we need to start the read
            if(_file.fail()){ // If we can't find a block like this
                return nullptr;
            }

            ChainLink *newCL = new ChainLink(_cellCount); // Create a new chainlink, that we can send back
            newCL->setBlockID(((cellNumber/_cellCount)*_cellCount)); // Set the blockID
            _file.read(reinterpret_cast<char*>(newCL->getBlock()), _cellCount*sizeof(T)); // Read the things
            if(_file.fail()){ // If we can't find a block like this
                std::cerr << "Can't read from the file: " << strerror(errno) << std::endl;
                delete newCL; // On failure, free the memory of the new chainlink
                return nullptr;
            };
            return newCL; // Return with the newly created chainlink
        };

        int _saveBlock(ChainLink *CLToWrite){
            _file.seekp(CLToWrite->getBlockID()*sizeof(T), std::ios::beg); // Go to the position where we want to write the info
            _file.write(reinterpret_cast<char*>(CLToWrite->getBlock()), _cellCount*sizeof(T)); // Write the content of the block to the position
            if (_file.fail()) { // If we can't write the file
                std::cerr << "Can't write to the file: " << strerror(errno) << std::endl;
                return 1;
            }
            return 0;
        };
};

main.cpp:

#include <iostream>
#include <fstream>

#include "includes/dbengine.hpp"


int main(int argc, char **argv){

        PointerHandler<std::string> ph("strings.dbb", 1, 8);
        ph.setCell(0, "value00");
        ph.setCell(1, "value01");
        ph.setCell(2, "value02");
        ph.setCell(3, "value03");
        ph.setCell(4, "value04");
        ph.setCell(5, "value05");
        ph.setCell(6, "value06");
        ph.setCell(7, "value07");
        return 0;
};

The idea is to read "blocks" of data from a binary file into a linked list when they are needed. The problem is that when I try to read from or write to the file, I get the message "Can't write to the file: Success" (so the errno value is 0), and the file is empty after closing. The failed read on the first run is expected: the file is empty, so there is nothing to read, and it should work once something has been written to the file. What I can't understand is why the program can't write anything into the file.

I've created a simple program that just writes a struct into a binary file to test the environment (Ubuntu, code-server), and it works, so there is no problem with permissions or anything like that.

What did I do wrong?

bradacsa
  • Did you try writing code to read/write binary files with a simpler data type? – Beta Mar 26 '23 at 01:09
  • It appears to me that the shown code uses `malloc` to allocate something which will store an array of `std::string`s. In `PointerHandler<std::string>`, `T` is `std::string`, and it's getting `malloc`ed: `_block = (T*) malloc(sizeof(T)*cellCount);` Strike 1. Then the shown code bravely attempts to `write()` `N*sizeof(std::string)` bytes into a file directly. Strike 2. I didn't want to get called out, so I stopped at that point. Whatever's actually wrong with the writing process is the least of the issues here. Everything here is fundamentally wrong. – Sam Varshavchik Mar 26 '23 at 01:27
  • *What did I do wrong?* -- What you did wrong is what countless others have done wrong: attempting "binary file reading/writing" without knowing, or being told, that C++ has types that cannot be written to or read from files this way. If the type you are writing cannot pass the test `std::is_trivially_copyable<T>()`, then it cannot be used as you are using it now. It needs to be *serialized* properly. As mentioned by @SamVarshavchik, your code is fundamentally broken, so it would be pointless to tell you what is wrong, except on a high level. – PaulMcKenzie Mar 26 '23 at 03:07
  • "What you did wrong is what countless others have done wrong: attempting "binary file reading/writing" without knowing, or being told, that C++ has types that cannot be written to or read from files this way." Thanks. Now, is there any knowledge base or link you would suggest to start with? I'm learning C++ by myself, maybe that's why nobody told me these things, but I'm ready to learn... – bradacsa Mar 26 '23 at 11:24
  • Can you explain how you're "learning C++ by myself", exactly? Are you following an organized, structured curriculum from a C++ textbook, one chapter at a time, reading each chapter and trying its exercises? What is the topic of the chapter in your textbook that this practice problem is from? Or are you trying to learn the hardest and most complicated general purpose programming language in use today by running keyword searches, reading blogs, and whatever random stuff comes back from a search query? Sorry, you won't learn C++ this way, it's just too complex. – Sam Varshavchik Mar 26 '23 at 11:42
  • @bradacsa -- The simplest thing to do is to make sure that the template is not instantiated for types that are not trivially copyable. C++20 concepts, or failing that `std::enable_if` in the template argument, would be what you would use. That would cover most cases, but there are still types that have member variables that are pointers, and *are* trivially copyable. But a pointer value makes no sense if written to a file. Do a web search for "serialization library C++" to find libraries that serialize data properly. – PaulMcKenzie Mar 26 '23 at 12:28
  • BTW, objects in C++ must be properly constructed before they can be considered objects. Using `malloc` as you have done does *not* create objects -- all it does is allocate `N` bytes, nothing more, nothing less. Your class then pretends it instantiated (in this case) an array of N `std::string` objects, when it didn't do this at all. So that's why the code is fundamentally broken from the start. See [this link](/questions/234724/is-it-possible-to-serialize-and-deserialize-a-class-in-c) as to how to serialize data (I may close as a duplicate). – PaulMcKenzie Mar 26 '23 at 12:36
  • @SamVarshavchik - Both. Basically I watched a few courses on YouTube like "FreeCodeCamp" and I looked at university websites like https://infocpp.iit.bme.hu/tananyag (it's in my native language). And yes, random stuff as well, like TheCherno's YouTube channel, Stack Overflow searches, ChatGPT. But I just can't find any course that explains things in more depth. – bradacsa Mar 26 '23 at 12:49
  • @bradacsa No problem. The bottom line is that your current code may work perfectly if T is an int, double, float, or a struct like `struct s { int x; int y; char z; int abc[10]; };`, since those are trivially copyable. Try it and you may see that your code does work. The problem is for the types I mentioned previously (non trivially-copyable types, like `std::string`), and that is where you hit the brick wall. A different strategy is then needed. – PaulMcKenzie Mar 26 '23 at 13:03
  • None of the things you mentioned works as a replacement for a C++ textbook. – Sam Varshavchik Mar 26 '23 at 14:59
  • @SamVarshavchik And which one do you recommend to buy/use? – bradacsa Mar 26 '23 at 18:50
  • See [Stack Overflow's list of recommended C++ textbooks](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list). – Sam Varshavchik Mar 26 '23 at 22:33
  • @SamVarshavchik Thanks a lot, I'll definitely check them out! – bradacsa Mar 27 '23 at 07:56
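
The point raised in the comments about trivially copyable types can be sketched roughly as follows. This is only an illustration, not the asker's design: `writeBlock`, `readBlock`, `Record` and the file name are hypothetical names, and `std::vector` is swapped in for the `malloc`ed buffer so that the cells are properly constructed objects. The `static_assert` rejects `std::string` and similar types at compile time instead of letting them be written as raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: writes a block of cells as raw bytes.
// Only compiles for types that are safe to copy byte by byte.
template <class T>
bool writeBlock(const std::string& path, const std::vector<T>& cells) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be written as raw bytes");
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(cells.data()),
              static_cast<std::streamsize>(cells.size() * sizeof(T)));
    out.flush(); // push the bytes to the file before reporting success
    return static_cast<bool>(out);
}

// Hypothetical helper: fills an already sized vector from the file.
template <class T>
bool readBlock(const std::string& path, std::vector<T>& cells) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be read as raw bytes");
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    in.read(reinterpret_cast<char*>(cells.data()),
            static_cast<std::streamsize>(cells.size() * sizeof(T)));
    return static_cast<bool>(in); // false if fewer bytes than requested were read
}

// A plain struct like the one suggested in the comments; no pointers, no std::string.
struct Record {
    int x;
    int y;
    char z;
    int abc[10];
};
```

With these helpers, `writeBlock("block.bin", std::vector<Record>(8))` compiles, while `writeBlock("block.bin", std::vector<std::string>(8))` is rejected by the `static_assert`. Storing strings would need a real serialization format (for example, a length followed by the characters) rather than a raw byte copy.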

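
Separately from the `std::string` issue, one likely contributor to the "Can't write to the file: Success" message is the stream state. A read that hits end-of-file sets `failbit` and `eofbit` on the `std::fstream`, and every later operation on that stream fails until `clear()` is called. No system call is involved in that failure, so `errno` stays 0 and `strerror(errno)` reports "Success". The sketch below is an assumption about what happens in the question's code (where `initBlocks()` reads from a freshly created, empty file before anything is written); `writeAfterFailedRead` and the file name are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: creates an empty file, attempts a read that must fail,
// then attempts a write. Returns true if the write succeeded.
bool writeAfterFailedRead(const std::string& path, bool clearAfterFailedRead) {
    std::remove(path.c_str()); // start without an existing file
    std::fstream file(path, std::ios::in | std::ios::out |
                            std::ios::binary | std::ios::trunc);
    if (!file.is_open()) return false;

    char buffer[16];
    file.seekg(0, std::ios::beg);
    file.read(buffer, sizeof buffer); // empty file: sets eofbit and failbit

    if (clearAfterFailedRead) {
        file.clear(); // reset the error flags so later operations can run
    }

    const char payload[] = "value00";
    file.seekp(0, std::ios::beg);
    file.write(payload, sizeof payload); // does nothing while failbit is set
    file.flush();
    return !file.fail();
}
```

Calling `writeAfterFailedRead("demo.bin", false)` returns `false`, while passing `true` lets the write go through. In the question's code, the equivalent step would presumably be a `_file.clear()` after the failed read in `_loadBlock`, though that alone would not make writing `std::string` objects as raw bytes valid.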