I am trying to analyze a very large text string in Python that contains nvidia-smi output, and I would rather spend my time analyzing the data than working on my regex skills. I came up with the regex below, but it takes forever on some rows. That might be caused by variation in the input data, but I suspect the pattern itself is also very compute-intensive.
extracted_line1 = r'[=]*[+][=]*[+][=]*\|\n\|(\s+(.*?)\|)+\n\|(\s+(.*?)\|)(\s+(.*?)\|)(\s+(.*?)\|)\n\|'
This pattern matches the third row in the table, the one shown below ⬇️
===============================+======================+======================|
| 0 GeForce GTX 1080 On | 00000000:04:00.0 Off | N/A |
| 27% 20C P8 6W / 180W | 2MiB / 8119MiB | 0% E. Process |
| | | N/A |
It works for most rows but hangs, seemingly at random, on some of them. What would be a simpler version of this regex? Or, perhaps a better question: what is the best approach to grab each value in this table for every row (the corresponding metrics for each GPU)?
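One possible cause of the hang, offered as a guess rather than a diagnosis: the pattern repeats a group that itself contains `.*?`, as in `(\s+(.*?)\|)+`, and because `.` can also match `|` and `\s+` can cross newlines, a failed match may force the engine to try a very large number of ways to split the text. A minimal sketch that sidesteps this by not using a regex for the splitting at all is below. The function name `iter_gpu_blocks` is made up for illustration, and it assumes each GPU entry is a run of three-cell `|` rows closed by a border line, with the first cell starting with the numeric GPU index.

```python
def iter_gpu_blocks(text):
    """Yield one block per GPU; a block is a list of rows, a row is a list of 3 cells."""
    block = []
    # The trailing "" is a sentinel so the last block is flushed too.
    for raw in text.splitlines() + [""]:
        line = raw.strip()
        cells = []
        if line.startswith("|") and line.endswith("|"):
            cells = [cell.strip() for cell in line[1:-1].split("|")]
        if len(cells) == 3:
            block.append(cells)
            continue
        # Any other line (border, banner, free text) closes the current block.
        # Keep the block only if its first cell starts with a numeric GPU index,
        # which filters out the column-header block.
        first_tokens = block[0][0].split() if block else []
        if first_tokens and first_tokens[0].isdigit():
            yield block
        block = []


# Usage with a fragment of the table from the question.
sample = """\
|===============================+======================+======================|
| 0 GeForce GTX 1080 On | 00000000:04:00.0 Off | N/A |
| 27% 20C P8 6W / 180W | 2MiB / 8119MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
"""
for gpu_block in iter_gpu_blocks(sample):
    print(gpu_block[0][0], "->", gpu_block[1][1])
```

Each line is scanned once and split on a fixed character, so the cost grows linearly with the input size regardless of what the rows contain.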
The truncated input string is here:
... bunch of text
nvidia-smi:
Tue Jun 8 15:00:02 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.80 Driver Version: 460.80 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 On | 00000000:04:00.0 Off | N/A |
| 27% 20C P8 6W / 180W | 2MiB / 8119MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 1080 On | 00000000:05:00.0 Off | N/A |
| 27% 23C P8 6W / 180W | 2MiB / 8119MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
... bunch of text
P.S. I am trying to extract the following values:
gpu_index = [processed result of regex output here]
gpu_model_name = [processed result of regex output here]
persistence_mode = [processed result of regex output here]
bus_id = [processed result of regex output here]
display_active = [processed result of regex output here]
volatile_ecc = [processed result of regex output here]
fan = [processed result of regex output here]
temperature = [processed result of regex output here]
perf = [processed result of regex output here]
power_usage = [processed result of regex output here]
max_power = [processed result of regex output here]
memory_usage = [processed result of regex output here]
available_mem = [processed result of regex output here]
gpu_utilization = [processed result of regex output here]
compute_mode = [processed result of regex output here]
multiple_instance_gpu_mode = [processed result of regex output here]
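As a rough sketch of how these values could be pulled out, one option is to apply three small patterns one line at a time, each anchored with `^` and `$`, instead of a single multi-line pattern. The group names follow the list above; the function name `parse_gpus` and the exact patterns are assumptions based only on the sample shown here (for example, that persistence and display values are `On`, `Off` or `N/A`, and that memory is reported in MiB), so other driver versions may need adjustments. All values are kept as strings.

```python
import re

# Row 1: index, model name, persistence | bus id, display | ECC
ROW1 = re.compile(
    r"^\|\s*(?P<gpu_index>\d+)\s+(?P<gpu_model_name>\S.*?)\s+"
    r"(?P<persistence_mode>On|Off|N/A)\s*"
    r"\|\s*(?P<bus_id>[0-9A-Fa-f:.]+)\s+(?P<display_active>On|Off|N/A)\s*"
    r"\|\s*(?P<volatile_ecc>[^|\s]+)\s*\|$"
)
# Row 2: fan, temperature, perf, power | memory | utilization, compute mode
ROW2 = re.compile(
    r"^\|\s*(?P<fan>[^|\s]+)\s+(?P<temperature>\d+)C\s+(?P<perf>P\d+)\s+"
    r"(?P<power_usage>[^|\s]+)\s*/\s*(?P<max_power>[^|\s]+)\s*"
    r"\|\s*(?P<memory_usage>\d+)MiB\s*/\s*(?P<available_mem>\d+)MiB\s*"
    r"\|\s*(?P<gpu_utilization>[^|\s]+)\s+(?P<compute_mode>\S.*?)\s*\|$"
)
# Row 3: two empty cells, then the MIG mode
ROW3 = re.compile(r"^\|\s*\|\s*\|\s*(?P<multiple_instance_gpu_mode>\S.*?)\s*\|$")


def parse_gpus(text):
    """Return one dict per GPU found in the text (hypothetical helper)."""
    gpus = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = ROW1.match(line)
        if match:
            # A row starting with a GPU index opens a new entry.
            current = match.groupdict()
            gpus.append(current)
            continue
        if current is None:
            continue
        match = ROW2.match(line) or ROW3.match(line)
        if match:
            current.update(match.groupdict())
        elif not line.startswith("|"):
            # A border line or free text closes the current entry.
            current = None
    return gpus


sample = """\
| 0 GeForce GTX 1080 On | 00000000:04:00.0 Off | N/A |
| 27% 20C P8 6W / 180W | 2MiB / 8119MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
"""
for gpu in parse_gpus(sample):
    print(gpu["gpu_index"], gpu["gpu_model_name"], gpu["memory_usage"])
```

Because each pattern only ever sees one short line and has no repeated group wrapped around a wildcard, the amount of backtracking stays bounded by the line length.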