Part of setting up a data source that reads a text file using a file-based protocol (such as FTP or Dropbox) is choosing which filenames will be a "match" for the data source. In other words, of the files available, which ones should have their records parsed by the text parser and ultimately ingested as time-series data for the parameters in your data source?
This choice is made during the data source creation wizard (although the configuration can be edited later). In this example, I'm setting up a new data source to read from my Dropbox folder, and I've already completed step 2 of the data source wizard. After clicking "Next", I'm expected to choose sample data, which will be used to configure the text parser in the subsequent step. My sample data must be chosen from the contents of my Dropbox folder, which has 3 files:
Note that the "File name match" text box starts out blank, which means all 3 files in my folder are displayed. I could start typing a file match into the text box, but in this case I only want to match my good file and avoid my two bad files, so it is easier to simply click on goodfile.txt.
This has two effects: the "File name match" text box is now populated with the name of the file I clicked, and the list of files in my Dropbox folder is now limited to only those files which match:
Note the dialog now includes a message "Matching 1 / 3 files", indicating the results of applying the match.
This may seem like the end of the problem; the good file will be matched when data is acquired from this Dropbox folder, while the bad files will not be matched. However, consider this additional scenario where my Dropbox folder contains a fourth file that we definitely don't want to acquire. It's a cumulative backup of every previous good file and as such is very large:
Now the file match has a problem; it's including a file we don't want, because part of the filename is identical to the file we do want. This is an illustration of how the file match actually works by default: if any filename contains the match, either wholly or partly, then it's considered to be matching. To solve this, we have to stop using the default matching behavior, and instead use a regular expression.
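The default behavior described above can be sketched in a few lines of Python. This is only an illustration of "contains" matching, not eagle.io's actual implementation. The file names other than goodfile.txt are hypothetical stand-ins for the folder contents, and case-sensitive comparison is an assumption.

```python
# Hypothetical file names standing in for the folder contents.
files = ["goodfile.txt", "badfile1.txt", "badfile2.txt", "goodfile.txt.backup"]

def default_match(pattern, names):
    """Return the names that contain `pattern` anywhere (plain substring test)."""
    return [name for name in names if pattern in name]

# A blank match is contained in every name, so every file is listed.
print(default_match("", files))

# The backup file is matched too, because its name contains "goodfile.txt".
print(default_match("goodfile.txt", files))
```

The second call returns both goodfile.txt and the (hypothetically named) backup file, which is exactly the problem the regex mode is meant to solve.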
A regular expression (usually abbreviated as regex) is a sequence of characters that specifies a search pattern. Regexes can be extremely complex at times, but in this case a very simple one will solve the problem. Since a regex must define a pattern, we should consider a pattern that will match our good file, but not the backup version. We can describe this in words as "find goodfile.txt but with no characters afterwards". In regex notation, the equivalent of "no characters afterwards" is the $ symbol, which anchors the match to the end of the filename. Therefore we will make two changes: click the box to enable regex mode, and add the $ symbol to our existing match:
This is the result we want, matching our single good file and excluding the 3 other files.
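The effect of the $ anchor can be reproduced outside the wizard with a short Python sketch. Python's re module is used here as a stand-in; eagle.io's regex engine may differ in its details, and the file names other than goodfile.txt are hypothetical.

```python
import re

# Hypothetical file names standing in for the folder contents.
files = ["goodfile.txt", "badfile1.txt", "badfile2.txt", "goodfile.txt.backup"]

def regex_match(pattern, names):
    """Return the names in which the regex is found anywhere (unanchored search)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]

# Without the anchor, the backup file is still matched.
print(regex_match(r"goodfile.txt", files))   # ['goodfile.txt', 'goodfile.txt.backup']

# With $, the pattern must end where the filename ends.
print(regex_match(r"goodfile.txt$", files))  # ['goodfile.txt']

# An unescaped . matches any character; \. matches only a literal dot.
print(regex_match(r"goodfile.txt$", ["goodfile_txt"]))   # ['goodfile_txt']
print(regex_match(r"goodfile\.txt$", ["goodfile_txt"]))  # []
```

The last two lines show a subtlety worth knowing: in a regex, an unescaped . matches any single character, so goodfile\.txt$ is the stricter form. For the files in this example both patterns give the same result.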
This regex example was simple, to solve a simple problem. To match more complex search patterns, you would need to use more complex regular expressions. Because crafting the right regex often involves trial and error, an online utility such as https://regex101.com, which shows results as you type, is highly recommended. Using our previous example, this utility would show the following:
Note that I have entered the regex in the top left, and then typed two test strings below that (the names of two files that may be encountered). I have instant confirmation about which filename is matched, plus a useful explanation about what the regex is actually doing.
You too can use regexes to impress your friends, confuse your enemies, and match exactly the files you want to acquire in eagle.io. Good luck!