2021-12-21

Fast way to format and save a numpy array of x, y, z coordinates to a text file

I need to write a large number of vertices to a text file in a specific format (Wavefront .obj), so I tested a few approaches:

import numpy as np

def write_test(vertices, file_path, overwrite=True):
    """loop through each vertex, format and write"""
    if overwrite:
        with open(file_path, 'w') as obj_file:
            obj_file.write('')
    with open(file_path, 'a') as test_file:
        for v in vertices:
            test_file.write('v %s %s %s\n' % (v[0], v[1], v[2]))


def write_test2(vertices, file_path, overwrite=True):
    """use np.savetxt"""
    if overwrite:
        with open(file_path, 'w') as obj_file:
            obj_file.write('')
    with open(file_path, 'a') as test_file:
        np.savetxt(test_file, vertices, 'v %s %s %s\n', delimiter='', newline='')


def write_test3(vertices, file_path, overwrite=True):
    """avoid writing in a loop by creating a template for the entire array, and format at once"""
    if overwrite:
        with open(file_path, 'w') as obj_file:
            obj_file.write('')
    with open(file_path, 'a') as test_file:
        temp = 'v %s %s %s\n' * len(vertices)
        test_file.write(temp % tuple(vertices.ravel()))


def write_test4(vertices, file_path, overwrite=True):
    """write only once, use join to concatenate string in memory"""
    if overwrite:
        with open(file_path, 'w') as obj_file:
            obj_file.write('')
    with open(file_path, 'a') as test_file:
        test_file.write('v ' + '\nv '.join(' '.join(map(str, v)) for v in vertices))

As it turns out, to my surprise, write_test is faster than write_test2, and write_test3 is the fastest of the four:

In [2]: a=np.random.normal(0, 1, (1234567, 3))

In [3]: %timeit write_test(a, 'test.obj')
2.6 s ± 94.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit write_test2(a, 'test.obj')
3.6 s ± 30 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit write_test3(a, 'test.obj')
2.23 s ± 7.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit write_test4(a, 'test.obj')
3.49 s ± 19.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
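The timings above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The names write_loop and best_time are hypothetical helpers for this sketch, the array is kept small so it runs quickly, and it reports the fastest single run rather than the mean ± std. dev. that %timeit prints.

```python
import os
import tempfile
import timeit

import numpy as np


def write_loop(vertices, file_path):
    """Stand-in writer for this sketch (same idea as write_test)."""
    with open(file_path, 'w') as obj_file:
        for v in vertices:
            obj_file.write('v %s %s %s\n' % (v[0], v[1], v[2]))


def best_time(func, vertices, repeat=3):
    """Return the fastest of `repeat` single runs of func, in seconds."""
    fd, path = tempfile.mkstemp(suffix='.obj')
    os.close(fd)
    try:
        runs = timeit.repeat(lambda: func(vertices, path), number=1, repeat=repeat)
        return min(runs)
    finally:
        os.remove(path)  # clean up the temporary output file


sample = np.random.normal(0, 1, (1000, 3))  # small array, for illustration only
elapsed = best_time(write_loop, sample)
print('best of 3: %.4f s' % elapsed)
```

Any of the writer functions can be passed to best_time as long as it accepts (vertices, file_path).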

Currently, writing to the text file is the bottleneck in my vectorized code.

Looking at the np.savetxt source, as rchome suggested, savetxt seems to do a lot of generalized formatting work and probably loops in Python anyway, so it is no wonder that it is slower than the simple Python loop in write_test.
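To illustrate that point, here is a rough approximation of the per-row work savetxt ends up doing for this format. It is a simplified sketch, not the library's actual source: savetxt_like is a hypothetical name, and the real function additionally handles headers, footers, complex values, and format validation. For a multi-specifier format string the two produce the same text.

```python
import io

import numpy as np


def savetxt_like(fh, rows, fmt):
    """Hypothetical simplification of a per-row formatting loop."""
    for row in rows:
        # One Python-level string format and one write call per row.
        fh.write(fmt % tuple(row))


vertices = np.array([[0.0, 1.5, -2.0], [3.25, 4.0, 5.5]])
fmt = 'v %s %s %s\n'

looped = io.StringIO()
savetxt_like(looped, vertices, fmt)

reference = io.StringIO()
np.savetxt(reference, vertices, fmt, delimiter='', newline='')

# Both buffers hold the same two "v x y z" lines.
print(looped.getvalue() == reference.getvalue())
```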

So my question now is: is there a faster way to accomplish this?
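One variation that might be worth timing against the functions above is converting the array to nested Python lists first, so that the formatting operates on plain Python floats rather than NumPy scalars. This is only a sketch under that assumption: write_tolist is a hypothetical name, it has not been benchmarked here, and whether it helps will depend on the data and the NumPy version. For float64 input it should produce the same text as the '%s' approaches.

```python
import os
import tempfile

import numpy as np


def write_tolist(vertices, file_path):
    """Hypothetical variant: format native Python floats, write in one call."""
    rows = vertices.tolist()  # nested lists of plain Python floats
    with open(file_path, 'w') as obj_file:
        obj_file.writelines('v %s %s %s\n' % (x, y, z) for x, y, z in rows)


vertices = np.array([[0.0, 1.5, -2.0], [3.25, 4.0, 5.5]])

fd, path = tempfile.mkstemp(suffix='.obj')
os.close(fd)
write_tolist(vertices, path)
with open(path) as obj_file:
    written = obj_file.read()
os.remove(path)  # clean up the temporary output file

print(written)
```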



from Recent Questions - Stack Overflow https://ift.tt/3F68Gxg