I am writing an I/O-intensive program in Python and need to allocate a specific amount of storage on the hard disk. Since it must be as fast as possible, I do not want to create the file by writing zeros (or dummy content) in a loop. Does Python have a library or method for this, or do I have to call a Linux command from Python?

Actually, I am implementing an application that works like BitTorrent. In my code, the receiver stores every segment of the source file in a separate file (each segment comes from a random sender). At the end, all the separate files are merged, which takes a lot of time.

Therefore, I want to allocate the file in advance and then write every received segment at its offset in the pre-allocated file.

def handler(self):    
    BUFFER_SIZE = 1024  # Normally 1024, but we want fast response
    # self.request is the TCP socket connected to the client
    data = self.request.recv(BUFFER_SIZE)
    addr = ..... #Some address

    details = str(data).split() 
    currentFileNum = int(details[0]) #Specifies the segment number of the received file.
    totalFileNumber = int(details[1].rstrip('\0')) # Specifies the total number of the segments that should be received.
    print '\tReceive: Connection address:', addr,'Current segment Number: ', currentFileNum, 'Total Number of file segments: ', totalFileNumber

    f = open(ServerThreadHandler.fileOutputPrefix + '_Received.%s' % currentFileNum, 'wb')
    data = self.request.recv(BUFFER_SIZE)
    while (data and data != 'EOF'):
        f.write(data)
        data = self.request.recv(BUFFER_SIZE)
    f.close()
    print "Done Receiving." ," File Number: ", currentFileNum
    self.request.sendall('\tThank you for data. File Number: ' + str(currentFileNum))
    ServerThreadHandler.counterLock.acquire()
    ServerThreadHandler.receivedFileCounter += 1
    if ServerThreadHandler.receivedFileCounter == totalFileNumber:
        infiles = []
        for i in range(0, totalFileNumber):
            infiles.append(ServerThreadHandler.fileOutputPrefix + '_Received.%s' % i)

        File_manipulation.cat_files(infiles, ServerThreadHandler.fileOutputPrefix + ServerThreadHandler.fileOutputSuffix, BUFFER_SIZE) # It concatenates the files based on their segment numbers. 
    ServerThreadHandler.counterLock.release()
Mahyar Hosseini
  • Your question is unclear, you can make it clear by adding some code to it! – Mazdak Aug 21 '15 at 21:38
  • your question is duplicate : http://stackoverflow.com/questions/8816059/create-file-of-particular-size-in-python – Iman Mirzadeh Aug 21 '15 at 21:43
  • @imanMirzadeh I have read that post but that solution may give the result that you might not expect. And the other solution which uses "truncate" method is for NTFS file systems. – Mahyar Hosseini Aug 21 '15 at 22:21
  • Maybe [this answer](http://stackoverflow.com/a/139289/892493) or [this answer](http://stackoverflow.com/a/8706714/892493)? – drew010 Aug 21 '15 at 22:56
  • I edited the question and explained the details. The code is simple. It opens lots of files, writes in them and finally merges all of them based on their segment numbers. It works but it is so slow. @Kasramvd – Mahyar Hosseini Aug 21 '15 at 23:22

1 Answer

In general, this is handled at the OS level rather than by Python itself: modern file system drivers support sparse files. You pre-create an apparently zero-filled file and then, for each piece of data, seek to the offset where it belongs and write it there.
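The seek-and-write cycle can be sketched in Python 3 as follows. This is a minimal illustration, not the asker's code: the names `preallocate`, `write_segment`, and `SEGMENT_SIZE` are hypothetical, and it assumes every segment except possibly the last has the same fixed size, so a segment's offset is its number times that size.

```python
import os
import tempfile

SEGMENT_SIZE = 4  # hypothetical fixed segment size in bytes, kept tiny for the demo


def preallocate(path, total_size):
    """Create the file and extend it so that it reports total_size bytes."""
    with open(path, "wb") as f:
        # On file systems with sparse-file support this writes no data blocks;
        # the unwritten range simply reads back as zero bytes.
        f.truncate(total_size)


def write_segment(path, segment_number, data):
    """Write one received segment at its offset in the pre-created file."""
    # "r+b" opens for in-place update; "wb" would truncate the file again.
    with open(path, "r+b") as f:
        f.seek(segment_number * SEGMENT_SIZE)
        f.write(data)


# Demo: segments arrive out of order and segment 1 has not arrived yet.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "download.bin")
    preallocate(path, 3 * SEGMENT_SIZE)
    write_segment(path, 2, b"CCCC")
    write_segment(path, 0, b"AAAA")
    with open(path, "rb") as f:
        result = f.read()
```

Because each call opens its own file object and writes to a distinct byte range, no merge step is needed once the last segment has arrived.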

See "How to create a file with file holes?" to understand how to create such a file.
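Note that a hole does not reserve disk space, so a later write into a sparse file can still fail when the disk is full. If the space really has to be reserved up front, Python 3.3+ exposes `os.posix_fallocate` on Unix. The sketch below is a hedged example: the helper name `reserve` is hypothetical, and it assumes that falling back to a plain (possibly sparse) `os.ftruncate` is acceptable where `posix_fallocate` is missing or unsupported by the file system.

```python
import os
import tempfile


def reserve(path, size):
    """Reserve size bytes for path and return the name of the method used."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        if hasattr(os, "posix_fallocate"):
            try:
                # Asks the OS to allocate the blocks without Python writing zeros.
                os.posix_fallocate(fd, 0, size)
                return "posix_fallocate"
            except OSError:
                pass  # not supported on this file system; fall back below
        # Fallback: only sets the file length, which may leave a hole.
        os.ftruncate(fd, size)
        return "ftruncate"
    finally:
        os.close(fd)


# Demo: reserve 1 MiB in a temporary directory and check the reported size.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reserved.bin")
    method = reserve(path, 1024 * 1024)
    size = os.path.getsize(path)
```

Either way the file reports the requested length; only the `posix_fallocate` path guarantees that the blocks are actually allocated.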

Community
user3159253