-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.html
226 lines (221 loc) · 11.5 KB
/
README.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
<!DOCTYPE html>
<html>
<head>
<title>FileUtil Build Status NuGet version</title>
</head>
<body>
<h1 id="fileutil-build-status-nuget-version">FileUtil <a href="https://travis-ci.org/NinjaRocks/FileUtil.Core"><img src="https://travis-ci.org/NinjaRocks/FileUtil.Core.svg?branch=master" alt="Build Status" /></a> <a href="https://badge.fury.io/nu/FixedWidth.FileParser"><img src="https://badge.fury.io/nu/FixedWidth.FileParser.svg" alt="NuGet version" /></a></h1>
<p>.Net Library to read from fixed width or delimiter separated file using strongly typed objects.</p>
<hr />
<h2 id="fixed-width-or-delimiter-separated-file"><strong>Fixed Width or Delimiter Separated File</strong></h2>
<blockquote>
<p>What is Fixed width or Delimiter separated text files?</p>
</blockquote>
<p>Fixed width or Delimiter separeted text file is a file that has a specific format which allows for the manipulation of textual information in an organized fashion.<br />
Each row contains one record of information; each record can contain multiple pieces of data fields or columns. The data columns are separated by any character you specify called the delimiter. All rows in the file follow a consistent format and should be with the same number of data columns. Data columns could be empty with no value.</p>
<p><strong>CASE 1 :</strong> Simple pipe '|' separated Delimeter File is shown below (this could even be comma ',' separated CSV)</p>
<pre><code>|Mr|Jack Marias|Male|London|Active|||
|Dr|Bony Stringer|Male|New Jersey|Active||Paid|
|Mrs|Mary Ward|Female||Active|||
|Mr|Robert Webb|||Active|||
</code></pre>
<p><strong>CASE 2:</strong> The above file could have a header and a footer.
In which case, each row has an identifier called as Line head to determine the type of row in the file.</p>
<pre><code>|H|Department|Jun 23 2016 7:01PM|
|D||Jack Marias|Male|London|Active|||
|D|Dr|Bony Stringer|Male|New Jersey|Active||Paid|
|D|Mrs|Mary Ward|Female||Active|||
|D|Mr|Robert Webb|||Active|||
|F|4 Records|
</code></pre>
<p><strong>FileUtil</strong> can be used to parse both of the shown formats above. The line heads and data column delimiters (separators) are configurable as required per use case.</p>
<hr />
<h2 id="using-fileutil">Using FileUtil</h2>
<h2 id="case-1-let-us-see-how-we-can-parse-a-delimiter-separted-file-with-no-header-and-footer-rows"><strong>Case 1:</strong> Let us see how we can parse a delimiter separted file with no header and footer rows.</h2>
<p><strong>Note:</strong> This file will have rows with no line head as all lines are of same type
<em>ie. default is type of data.</em></p>
<blockquote>
<p>For the example rows below, we can parse the file with just few lines of code and minimal configuration.</p>
</blockquote>
<pre><code>|Mr|Jack Marias|Male|London|
|Dr|Bony Stringer|Male|New Jersey|
|Mrs|Mary Ward|Female||
|Mr|Robert Webb|||
</code></pre>
<blockquote>
</blockquote>
<h2 id="configuration"><strong>Configuration</strong></h2>
<p>We can configure the parser and provider setting for FileUtil to override the defaults.</p>
<p>The below configuration shows the complete set of attributes required to parse a delimiter separated file with no header or footer rows.</p>
<p>At a minimal you can specify the folder location <code>{"providerSettings":{ "folderPath":"C:work"}</code> where the source file will reside. You can override defaults as required.</p>
<p>By default, the file is read using the default file system provider. You can pass in your own implementation of the provider > to custom read file from desired location. We will later show how you can acheive this.</p>
<pre><code>{
"configSettings":{
"parserSettings": { "delimiter":{ "value":"|"} },
"providerSettings": {
"folderPath":"C:work",
"fileNameFormat":"File*.txt",
"archiveUponRead":"true",
"archiveFolder":"Archived"
}
}
</code></pre>
<p>}</p>
<p>The <code>"fileNameFormat":"File*.txt"</code> attribute (by default is empty) can
be used to search the folder location for files with a specific name pattern.
if not specified then all available files in the folder location will be searched.</p>
<h2 id="code"><strong>Code</strong></h2>
<p>To parse a row into a C# class, you need to implement <strong>FileLine</strong> abstract class.
By doing this you create a strongly typed line representation for each row in the file.</p>
<p>Consider the file below -</p>
<blockquote>
</blockquote>
<pre><code>|Mr|Jack Marias|Male|London|
|Dr|Bony Stringer|Male|New Jersey|
|Mrs|Mary Ward|Female||
|Mr|Robert Webb|||
</code></pre>
<blockquote>
</blockquote>
<p>Let us create an employee class which will hold data for each row shown in the file above. The properties in the line class should match to the column index and data type of the fields of the row.</p>
<p>We use the column attribute to specify the column index and can optionally specify a default value for the associated column should it be be empty. As a rule of thumb, the number of properties with column attributes should match the number of columns in the row else the engine will throw an exception.</p>
<p>FileLine base class has an index property that holds the index of the parsed line relative to the whole file, an array of errors (should there be any column parsing failures) and type property to denote if the file is of type header, data or footer (default is data)</p>
<pre><code> public class Employee : FileLine
{
[Column(0)]
public string Title { get; set; }
[Column(1)]
public string Name { get; set; }
[Column(2)]
public string Sex { get; set; }
[Column(3, "London")]
public string Location { get; set; }
}
</code></pre>
<p>Once you have created the line class it is as simple as calling the Engine.GetFile() method as follows</p>
<blockquote>
</blockquote>
<p><code>var files = new Engine<Employee>(configSettings).GetFiles();</code></p>
<p>The engine will parse the files found at the specified folder location and return a collection of
<code>File<Employee></code> objects ie. one for each file parsed with an array of strongly typed lines (in this case Employee[]).</p>
<pre><code>public class File<T> where T: FileLine
{
/// <summary>
/// File meta data.
/// </summary>
public FileMeta FileMeta { get; set; }
/// <summary>
/// Strongly typed parsed lines.
/// </summary>
public T[] Data { get; set; }
}
</code></pre>
<hr />
<h2 id="case-2-let-us-see-how-we-can-parse-a-delimiter-separted-file-with-header-and-footer-rows"><strong>Case 2:</strong> Let us see how we can parse a delimiter separted file with header and footer rows.</h2>
<p><strong>Note:</strong> This file will have rows with line head to determine each row type. By default, the line heads are 'H' for header, 'D' for data and 'F' for footer respectively. all these line heads are configurable via the config.</p>
<blockquote>
</blockquote>
<pre><code>|H|Department|Jun 23 2016 7:01PM|
|D|Mr|Jack Marias|Male|London|
|D|Dr|Bony Stringer|Male|New Jersey|
|D|Mrs|Mary Ward|Female||
|D|Mr|Robert Webb|||
|F|4 Records|
</code></pre>
<blockquote>
</blockquote>
<h2 id="configuration-1"><strong>Configuration</strong></h2>
<p>The configuration is the same as before. We can override the default line heads by specifying the required line head attribute in the parser settings.</p>
<pre><code>{
"configSettings":{
"parserSettings":{ "delimiter": { "value":"|"} ,
"lineHeaders": {
"header":"H",
"footer":"F",
"data":"D"
}
},
"providerSettings":{
"folderPath":"C:work",
"fileNameFormat":"File*.txt",
"archiveUponRead":"true",
"archiveFolder":"Archived"
}
}
}
</code></pre>
<h2 id="code-1"><strong>Code</strong></h2>
<p>Like before we need a line class to map to each type of the row in the file
ie one for the header, footer and data line respectively.</p>
<p>We continue by creating two extra classes HeaderLine and FooterLine as follows.</p>
<pre><code> public class HeaderLine : FileLine
{
[Column(0)]
public string Name { get; set; }
[Column(1)]
public DateTime Date { get; set; }
}
public class Footer : FileLine
{
[Column(0)]
public string FileRemarks { get; set; }
}
</code></pre>
<p>To parse the file you call the GetFiles() Method as follows -</p>
<blockquote>
</blockquote>
<p><code>var files = new Engine<HeaderLine, Employee, Footer>(configSettings).GetFiles();</code></p>
<p>The engine will parse the files and return a collection of <code>File<HeaderLine, Employee, Footer></code> objects
ie. one for each file parsed with strongly typed header, footer and data line arrays.</p>
<hr />
<h2 id="custom-file-provider-let-us-see-how-we-can-implement-custom-file-provider-for-bespoke-requirements"><strong>Custom File Provider:</strong> Let us see how we can implement custom file provider for bespoke requirements.</h2>
<p>You can implement your own custom provider and pass it to the engine to provide delimiter separated file lines by implementing your own bespoke logic. This could be reading the contents from the datadase or over http, etc.</p>
<p>To implement a custom provider you need to implement <strong>IFileProvide</strong> interface</p>
<p>An example dummy implementation is as follows</p>
<pre><code> public class CustomProvider : IFileProvider
{
public FileMeta[] GetFiles()
{
// custom implementation to return file lines
return new[]
{
new FileMeta
{
FileName = "Name",
FileSize = 100,
FilePath = "File Path",
Lines = new[] {"H|22-10-2016|Employee Status", "D|John Walsh|456RT4|True", "F|1"}
}
};
}
}
</code></pre>
<p>You can pass the custom provider to the engine as follows -</p>
<p><code>var files = new Engine<Employee>(configSettings, new CustomProvider()).GetFiles();</code></p>
<p><code>var files = new Engine<HeaderLine, Employee, Footer>(configSettings, new CustomProvider()).GetFiles();</code></p>
<p>Returns</p>
<pre><code>public class File<TH, TD, TF>
{
/// <summary>
/// File meta data.
/// </summary>
public FileMeta FileMeta { get; set; }
/// <summary>
/// Parsed header lines.
/// </summary>
public TH[] Headers { get; set; }
/// <summary>
/// Parsed data lines.
/// </summary>
public TD[] Data { get; set; }
/// <summary>
/// Parsed footer lines.
/// </summary>
public TF[] Footers { get; set; }
}
</code></pre>
<hr />
<h2 id="credits"><strong>Credits</strong></h2>
<p>Thank you for reading. Please fork, explore, contribute and report. Happy Coding !! :)</p>
</body>
</html>