Validate CSV file
I have a webpage that is used to submit a CSV file to the server. I have to validate the file for things like the correct number of columns, correct data types, cross-field validations, data-range validations, etc., and finally either show a success message or return a CSV with error messages and line numbers.
Currently every row and every column is looped through to find all the errors in the CSV file. This becomes very slow for bigger files and sometimes results in a server time-out. Can someone please suggest a better way to do this?
Thanks
Best answer
I would suggest a rule-based approach, similar to unit tests. Think of every error that can possibly occur and order the checks by increasing level of abstraction:
- correct file encoding
- correct number of lines/columns
- correct column headers
- correct number/text/date formats
- correct number ranges
- business rules
- ...
These rules could also have automatic fixes. For example, if you can detect the encoding automatically, you can correct it before testing all the other rules.
The implementation could be done using the command pattern:
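    public abstract class RuleBase
    {
        // Returns true when the file passes this rule.
        public abstract bool Test();

        // Override to return true if the rule can fix the problem automatically.
        public virtual bool CanCorrect() { return false; }
    }

Then create a subclass for each test you want to make and put them in a list.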
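As a rough sketch of what such subclasses and the list might look like: the names `ColumnCountRule`, `IntRangeRule` and `RuleRunner` below are hypothetical, and the code assumes the file has already been split into rows of string fields with row 0 as the header. Stopping at the first failing rule is only one possible policy; collecting every failure would work just as well.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Base class as given in the answer.
public abstract class RuleBase
{
    public abstract bool Test();
    public virtual bool CanCorrect() { return false; }
}

// Hypothetical rule: every row has the expected number of columns.
public sealed class ColumnCountRule : RuleBase
{
    private readonly IReadOnlyList<string[]> _rows;
    private readonly int _expected;

    public ColumnCountRule(IReadOnlyList<string[]> rows, int expected)
    {
        _rows = rows;
        _expected = expected;
    }

    public override bool Test()
    {
        return _rows.All(r => r.Length == _expected);
    }
}

// Hypothetical rule: one column holds integers within [min, max].
public sealed class IntRangeRule : RuleBase
{
    private readonly IReadOnlyList<string[]> _rows;
    private readonly int _column, _min, _max;

    public IntRangeRule(IReadOnlyList<string[]> rows, int column, int min, int max)
    {
        _rows = rows;
        _column = column;
        _min = min;
        _max = max;
    }

    public override bool Test()
    {
        // Skip the header row (index 0) and check every data row.
        return _rows.Skip(1).All(r =>
            _column < r.Length &&
            int.TryParse(r[_column], NumberStyles.Integer,
                         CultureInfo.InvariantCulture, out int value) &&
            value >= _min && value <= _max);
    }
}

public static class RuleRunner
{
    // Runs the rules in list order and returns the first one that fails,
    // or null when every rule passes.
    public static RuleBase FirstFailure(IEnumerable<RuleBase> rules)
    {
        foreach (var rule in rules)
        {
            if (!rule.Test()) return rule;
        }
        return null;
    }
}
```

Putting the cheap structural rules (encoding, column count, headers) first in the list means the more expensive checks only run on files that are at least well-formed.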
The timeout can be avoided by validating incoming files on a background thread instead of inside the request. The user has to wait until the file has been validated and becomes "active". When validation has finished, you can forward the user to the next page.
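One way this could look, as a minimal sketch: the upload request starts validation on a thread-pool thread and returns a job id at once, and the page then polls for the status. The `ValidationJobs` class and `ValidationStatus` enum are hypothetical names, and the status is assumed to live in memory only, so it would be lost if the server process restarts.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum ValidationStatus { Pending, Valid, Invalid }

// Hypothetical in-memory registry of background validation jobs.
public sealed class ValidationJobs
{
    private readonly ConcurrentDictionary<Guid, ValidationStatus> _status =
        new ConcurrentDictionary<Guid, ValidationStatus>();
    private readonly ConcurrentDictionary<Guid, Task> _tasks =
        new ConcurrentDictionary<Guid, Task>();

    // Starts validation in the background and returns immediately
    // with an id the client can use to poll for the result.
    public Guid Start(Func<bool> validate)
    {
        var id = Guid.NewGuid();
        _status[id] = ValidationStatus.Pending;
        _tasks[id] = Task.Run(() =>
        {
            bool ok;
            try
            {
                ok = validate();
            }
            catch (Exception)
            {
                // Assumption: a crash during validation counts as an invalid file.
                ok = false;
            }
            _status[id] = ok ? ValidationStatus.Valid : ValidationStatus.Invalid;
        });
        return id;
    }

    // Current status of a job; throws KeyNotFoundException for an unknown id.
    public ValidationStatus GetStatus(Guid id)
    {
        return _status[id];
    }

    // Task that completes when the job has finished (handy in tests).
    public Task WhenDone(Guid id)
    {
        return _tasks[id];
    }
}
```

The `validate` delegate would wrap whatever runs the rule list against the uploaded file; the page shows a "validating" state while `GetStatus` returns `Pending`.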