Out of memory exception using Newtonsoft.Json package

The Newtonsoft.Json package is probably one of the most essential packages in .NET software development. For those of you not knowing what it does: it takes care of object serialization to JSON notation and deserialization from JSON notation. I have used this package in numerous projects since its inception and I can only say great things about it.

However, as part of one of our product’s GDPR compliance upgrades, I encountered an interesting undocumented feature. After object serialization, the w3wp.exe process running the application pool for our product started consuming 100% CPU capacity and hogged so much memory that we experienced an “Out of memory” exception in a matter of minutes.

Since our product still uses .NET 3.5 (we are planning an upgrade to 4.7.2 shortly), tasks and the parallel library are not native. We are using Microsoft’s TaskParallelLibrary package to circumvent this framework deficiency. Hence, at first, I was dead sure that the library was the source of this issue, especially as we were doing serialization in an asynchronous method. After removing the creation of the new task, I was surprised to find that was not the case.

The object we wanted to serialize was a more complex derivative of this:
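A minimal sketch of the shape involved (the class and property names here are illustrative, not the actual product code; the point is that the Id getter throws when the value was never assigned):

    public class CustomException : Exception
    {
        public CustomException(string message) : base(message) { }
    }

    public class Person
    {
        private int? _id;

        public int Id
        {
            get
            {
                // Throws when Id was never assigned — this is what
                // later trips up the serializer.
                if (_id == null)
                    throw new CustomException("Id is not set.");
                return _id.Value;
            }
            set { _id = value; }
        }

        public string Name { get; set; }
    }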

The easiest way to serialize an instance of this object would be to do something like:
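Something along these lines, continuing the sketch above:

    // using Newtonsoft.Json;
    var person = new Person { Name = "John Doe" }; // Id intentionally left unset
    string json = JsonConvert.SerializeObject(person);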

Except this throws a CustomException, as the Id property is not set. The Newtonsoft.Json package documentation and StackOverflow answers offer a solution using serialization settings:
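The usual shape of that solution (a sketch; it marks every serialization error as handled so serialization continues past the offending member):

    var settings = new JsonSerializerSettings
    {
        Error = (sender, args) =>
        {
            // Tell Json.NET the error is handled so it skips the member.
            args.ErrorContext.Handled = true;
        }
    };
    string json = JsonConvert.SerializeObject(person, settings);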

This works as expected. It ignores exceptions thrown by a serialized object instance. Yay!

Not so fast. When using the above code as part of a web application, it will cause your application to hog all available CPU power and consume as much memory as possible. Promptly. Yikes! Surely not something you would want in a production environment. Running a debugger revealed the issue with this solution. Whenever the serialized object raised an exception, because of our serialization settings, the exception went by unhandled. This in turn put stress on the server’s CPU and caused a memory leak the size of Mt. Everest.

The bad thing is that, at the point of my writing, there is no option to tell the JSON serialization engine to actually handle all exceptions raised, and not just mark them as handled. I guess what you could do is create a new property for each property causing you a headache and decorate it with JsonPropertyAttribute accordingly, but in our case that would mean changing every property in the object (and there were plenty). What I ended up doing was converting the object to a DataTable (we use it for ADO anyway) and serializing that. Worked like a charm.
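A minimal sketch of that workaround (the column names are illustrative; Newtonsoft.Json serializes a DataTable out of the box):

    var table = new DataTable("People");
    table.Columns.Add("Name", typeof(string));
    table.Rows.Add(person.Name); // copy over only the values that are safe to read
    string json = JsonConvert.SerializeObject(table);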

Bug tracking – yes or no

Yesterday, I encountered an article on Medium titled A Better Bug Tracker by Anthony Sciamanna. The author goes to great lengths describing why bug trackers are unnecessary and point to a problem in your development workflow. Further, Mr. Sciamanna quotes Uncle Bob Martin (self-proclaimed Software Craftsman):

 

“Think about what it means to use a bug tracking system. You have so many bugs you need an automated system to keep track of them.”

 

Now, far be it from me to disagree with such software development authorities. And I do partially agree with the points made in the article. You should have a zero-bugs policy. Yes, you should modify your process to reduce the number of bugs. Yes, you should write unit tests. Yes, yes and yes. However, both gentlemen either do not know the purpose of bug trackers or they just pretend they do not in order to promote their ways. Personally, I am not sure which is worse.

First of all, unit tests are not a solve-it-all tool. Yes, they present de facto specifications for your code. Yes, they do make you think about possible edge cases. Still, a test is only as good as the developer that made it. Now, I expect some will start waving code coverage reports at me. I am sorry to tell you: I have seen code coverage of 100% and unit tests that weren’t worth the electricity used to produce them.

Next, bug/issue trackers were made for people to log bugs, features, tasks etc. that they cannot attend to at this very moment, and I am pretty sure that there isn’t a single bug tracker out there that was made with the intention of encouraging developers to produce bugs.

Every developer I know keeps some sort of log of features that need to be implemented, bugs that need to be fixed and tasks that must be performed (whether in Notepad++, Excel or JIRA), and I am pretty sure the author of said article does as well. The question is: why do we, developers, log bugs? The answer is simple. So they don’t get lost or forgotten. Yes, I get the fix-bugs-first policy, but let’s say you are in the middle of fixing a bug and a new bug report comes in. Should you stop fixing the bug you are currently working on? No. You log the new one and continue with your existing work.

I am glad to hear that Mr. Sciamanna and Uncle Bob Martin can hold everything in a queue in their heads while doing continuous context switching (or maybe they are just not that busy). I am, sadly, not of that sort. If you tell me two things at once while I am doing something completely different, you will be lucky if I fully remember one. Hence, I tend to write things down. And here is where a bug tracker comes in handy. I use it to log ideas for new features, bugs, tasks that await me during the day, the full Monty. Sure, you can use an Excel spreadsheet for that, but doesn’t that spreadsheet then become a simple bug tracker?

Not using a bug tracker does not imply that your software doesn’t have bugs. Much like sticking your head in the sand doesn’t make your rear end invisible to innocent observers on land. It makes you look stupid, though.

What I learned last week… uh… months

It has been a long time since I have written a post. Reasons vary. Most of it is down to my laziness and the limits on my spare time. Some of it is down to a lack of motivation as well.

Anyway, over the last several months I have, surprisingly, learned many new things. I limited my pick to the following items:

  1. You cannot set the Prefer 32-bit option for a class library in .NET
  2. ORACLE RDBMS column names must not exceed 30 characters
  3. People suggesting that copy & paste for VPN connections must be disabled should be “taken care of”
  4. No matter what the task is, you must take your time to solve it
  5. The CSRF feature known as warning SG0016 is annoying if you are implementing a public API
  6. How to use query string parameters in a non-RESTful API
  7. FastDirectoryEnumerator!
  8. When using an integration to move data, always use a separate table

Now to details.

 

You cannot set the Prefer 32-bit option for a class library in .NET

The setting can be located under Project properties -> Build, but it is disabled for class libraries. First of all, as per this StackOverflow article, the only difference between selecting “x86” as the platform target and using the “Prefer 32-bit” option is that an application compiled for “x86” will fail in an ARM-based environment, while an application compiled for “Any CPU” with “Prefer 32-bit” selected will not. My reasoning is that since executable projects are meant to define the architecture for the entire application, this setting would have no meaning in class libraries. Hence, it is disabled.

 

ORACLE RDBMS column names must not exceed 30 characters

Really. But only if you are running version 12.1 or lower. Otherwise, you can use names up to 128 characters. We found that out the hard way while migrating a MSSQL database to the ORACLE platform. Anyway, you can find out how long your column and table names can be by running the following statement in your SQL client:
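One way to check (a sketch, assuming you can query v$parameter; long identifiers require the COMPATIBLE parameter to be 12.2 or higher):

    -- 30-byte identifier limit when COMPATIBLE < 12.2, 128 bytes from 12.2 onwards
    SELECT value
      FROM v$parameter
     WHERE name = 'compatible';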

 

People suggesting that copy & paste for VPN connections must be disabled should be “taken care of”

The title says it all, really. Disabling the copy & paste option over a VPN connection might have some security benefits, and I am pretty sure some auditor can’t sleep if it is not disabled, but it is annoying as hell for anybody who actually tries to use a VPN connection for REAL work. Imagine you have to prepare a report for a customer that requires you to run a 300-line SQL statement. Obviously, you are not developing that in their environment. You are doing it in your local database. Now, you just need to somehow get it to the customer’s system. Copy & paste seems harmless enough. Yeah, not going to happen. So now you need Dropbox (best case scenario), or you are mailing that SQL to the customer’s admin and hoping that person knows what he/she is doing. Not to mention the awkward situation when you find out you forgot to add just one more column or condition to your SQL statement.

Kudos to all auditors, recommending ban of copy & paste. NOT.

 

No matter what the task is, you must take your time to solve it

Sounds reasonable enough. Right? Except when you are bogged down with work, and a trivial but urgent task comes in, forcing you to drop everything and focus on that specific task. Hah, but the task is trivial. What could possibly go wrong? Well, for starters, the fact that assumption is the mother of all clusterfucks (pardon my French). So now you have solved the task half-arsed and passed it back to the customer, only for it to hit you right back on the head 30 minutes later. Instead of doing it properly the first time, you will have to do it a second and hopefully not a third time, taking even more of the time you didn’t have in the first place. Meanwhile, your reputation with your customer is sinking faster than the RMS Titanic.

Even in times of stress and distress, it is important to remember that each and every task is worth your attention. If nothing else, it will save you minutes, if not hours, and leave your reputation intact.

 

The CSRF feature known as warning SG0016 is annoying if you are implementing a public API

The “new” Visual Studio 2017 comes with an abundance of new features. One of them gives you security recommendations that behave as warnings. Roslyn Security Guard, it is called. All fine and dandy. Sadly, though, most of those recommendations are useful only if you are developing internal applications. If you are building, let’s say, a public Web API, you really don’t want to hear the CSRF SG0016 warning telling you to validate an anti-forgery token, especially as all requests are coming from other servers and you have no way to validate that token.

There is a workaround, the standard warning-suppression pragma: add
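    #pragma warning disable SG0016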

just below the class declaration, which suppresses the warning until you restore it:
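    #pragma warning restore SG0016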

I would have still preferred a project option to disable that, though.

 

How to use query string parameters in a non-RESTful API

I had to connect to a 3rd party non-RESTful API that invented all sorts of parameter-passing options, from classic JSON for POST requests to a combination of route parameters and query string parameters. As I had no access to the API from my development environment, I created a mock API and had to mimic the original API’s behavior.

For route parameters, you simply define a route that knows how to handle them, like so:
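A minimal sketch (ASP.NET Web API; the route and controller names are illustrative):

    // Route registration, e.g. in WebApiConfig.Register:
    config.Routes.MapHttpRoute(
        name: "ItemById",
        routeTemplate: "api/items/{id}",
        defaults: new { controller = "Items" });

    // The {id} route parameter binds to the method parameter by name:
    public class ItemsController : ApiController
    {
        public string Get(int id)
        {
            return "item " + id;
        }
    }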

If you want to obtain a parameter from the query string, though, you must put [FromUri] in front of it in the method declaration:
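Something like this sketch (the parameter name is illustrative; a request would look like /api/items?maxCount=10):

    public class ItemsController : ApiController
    {
        public string Get([FromUri] int maxCount)
        {
            return maxCount + " items";
        }
    }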

 

FastDirectoryEnumerator!

A quick task: you need to move 10,000 files from one folder to another.

Solution 1

Use Directory.GetFiles to get a list of all files in a directory and then use File.Copy to copy them to another location.
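A minimal sketch (sourceDir and targetDir are placeholder variables):

    // Materializes every file name into one big array up front.
    string[] files = Directory.GetFiles(sourceDir);
    foreach (string file in files)
    {
        File.Copy(file, Path.Combine(targetDir, Path.GetFileName(file)));
    }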

The problem with this solution, however, is that although it works fast, it stores all file names in a string array, hogging your memory resources like crazy.

Solution 2

Use Directory.EnumerateFiles to get a list of all files in a directory and then use File.Copy to copy them to another location.
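The same sketch with lazy enumeration:

    // Streams file names one by one instead of building an array.
    foreach (string file in Directory.EnumerateFiles(sourceDir))
    {
        File.Copy(file, Path.Combine(targetDir, Path.GetFileName(file)));
    }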

A much better solution, as it returns files as IEnumerable<string>, which allows you to start processing files before all of them are loaded.

 

Now imagine that the source or destination (or both) for the files that need transferring is on a network drive. In that case, the first solution will take around 30 seconds to read all the files. The second will not fare much better, getting all the files read in about 25 seconds. And this on a fast network drive.

Introducing FastDirectoryEnumerator for the next solution.

Solution 3

Using FastDirectoryEnumerator.EnumerateFiles, it read 10,000 files in about 20 milliseconds. Yes, that is right. Milliseconds.
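A sketch of its use (member names per the CodeProject article; FileData carries the file name and full path, so no extra round-trip per file is needed):

    foreach (FileData file in FastDirectoryEnumerator.EnumerateFiles(sourceDir))
    {
        File.Copy(file.Path, Path.Combine(targetDir, file.Name));
    }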

You can check the documentation and implementation on the CodeProject site. The secret is, apparently, in not doing a round-trip to the network for each and every file. That, and using kernel32.dll.

 

When using an integration to move data, always use a separate table

Another project of mine has a bug. It is yet to be decided if it is human or code, but in any case, the code should prevent such situations.

This is what happens. The code moves some data from the table ITEMS via a 3rd party web service to their product. This is driven by a column named STATUS in the table ITEMS, which must hold a certain value. The code sets the status to “moved to 3rd party service” prior to completion and to “error” in case of execution errors. Upon completion, a 3rd party code is written into another field (let’s call it EXT_ID).

Unfortunately, the web interface for adding and editing items also uses the STATUS field for document workflow, meaning it sets the status on certain actions.

Lately, this started to happen. An item gets picked, its status is set to “moved to 3rd party service”, and the transfer completes and sets EXT_ID. During this process, someone with the item open in a browser clicks the “Confirm” button again in the web interface and sets the status back to “pending for transfer”. The action also removes EXT_ID. As the 3rd party service checks for duplicates, it returns a duplication error.

To avoid this, a far better solution would be to create a table ITEMS_TRANSFER. A row would be added to this table (with a hash of the values) when a transfer is requested, and removed (or marked as removed) when the transfer completes. This would certainly prevent duplication errors.
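A minimal sketch of such a table (the names and types are illustrative):

    CREATE TABLE ITEMS_TRANSFER (
        ITEM_ID     INT          NOT NULL,
        VALUES_HASH VARCHAR(64)  NOT NULL,  -- hash of the transferred values
        REQUESTED   DATETIME     NOT NULL,
        COMPLETED   DATETIME     NULL,      -- or a "removed" flag, per the description above
        EXT_ID      VARCHAR(50)  NULL
    );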

What I learned last week at work #3

In a 3-day week, I only managed to learn how to get distinct IP addresses from a log file.

How to get distinct IP addresses from a log file

For a customer of ours, I had to screen two years of log files and find distinct IP addresses matching certain criteria. You could check those log files by hand. Sure, it would take a month or two, but it could be done. However, if you are not keen on spending your days looking at log files line by line, here is what you can do (the single command below combines all of these steps):

  1. Grep the log files for the specified criteria;
  2. Parse the results to get all IP addresses;
  3. Use awk to print them on separate lines;
  4. Use awk again to print only the distinct ones;
  5. Optionally, store the output in a file.

Ideally, you want to run this in one command:
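A sketch of such a pipeline (the criteria and log file name are placeholders; the IPv4 regex and the awk dedup idiom are standard):

    grep "SomeCriteria" access.log \
      | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
      | awk '{ print $1 }' \
      | awk '!seen[$0]++' \
      > ip_addresses.log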

There you have it! The file ip_addresses.log now contains only distinct IP addresses.

I am pretty sure it can be done differently. You can leave your solution in the comments below.

What I learned last week at work #2

It’s been a quiet week at work. Fixing a bug here and there, implementing minor features, writing some documentation, etc. Hence, this week’s findings are not programming related.

Without further ado, here is what I learned last week:

  • Windows 10 app restart on unexpected shutdown (or after update restart) cannot be disabled;
  • Solving ‘PkgMgr.exe is deprecated’ error.

Now to details.

 

Windows 10 app restart on unexpected shutdown cannot be disabled

Since the Fall Creators Update, Windows 10 has gained an interesting feature. Much like OS X, it restores your applications upon an unexpected shutdown or a maintenance restart. Now, I bet this feature sounds great on paper and I bet it is perfect for your everyday user. However, the feature is totally useless and annoying to anyone doing something more with his/her computer besides browsing the internet and watching the occasional X-rated movie.

Imagine this. At the point of a maintenance restart (updates have finished installing), I have 7 Visual Studio 2012 instances in administrator mode, 5 Visual Studio 2010 instances (again in administrator mode), 6 Microsoft SQL Management Studios, a Notepad++, Outlook, 3 Word documents and 5 Excel worksheets open. I am not even going to count remote desktop sessions and other minor software windows. Now, the computer reboots, comes back, and I am presented with a login prompt. After typing my password 3 times (seriously, I need another password), the OS starts loading all the windows mentioned above. Except it opens all Visual Studio instances in normal mode and without opened solutions (thanks for that, btw). Same goes for the MS SQL Management Studios. It opens 6 instances, not one having an active connection or at least the correct SQL instance selected. Useless and annoying.

To top it all off, apparently this feature cannot be turned off, and no update to make that possible is scheduled at this point.

Solving ‘PkgMgr.exe is deprecated’ error

After a server came crashing down, we had to set up a new one. After completing the installation of server roles and features and our applications, I tried running some of them and got a Service unavailable error. So I tried to register .NET with IIS. For .NET 3.5 that means the classic aspnet_regiis call from the 2.0 framework directory (the exact path varies per machine):
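    %WINDIR%\Microsoft.NET\Framework\v2.0.50727\aspnet_regiis.exe -i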

This returned another error: PkgMgr.exe is deprecated. Quick googling found this page, which explains that the cause of the error is a missing ASP.NET installation. I went back to the server installation and selected ASP.NET 3.5. That solved the problem.

What I learned last week at work

I am a firm believer in the idea that if you are not learning anything new at your work, it is time to move out of that comfort zone, pack your bags and find a gig where you will. Lately, my work has shifted and consists of 99% maintenance grunt work and 1% actual new development. In that kind of situation, a person can easily forget that, despite chewing the dog food, there is an occasional pickle here and there. So, I created this series. To remind myself that I am still learning something new and, hopefully, to provide some extra value to whoever stumbles upon this place.

So, these are the things I learned in the past week:

  1. The keyword INTO is not necessary when running INSERT SQL statements on Microsoft SQL Server;
  2. A direct cast of a column value of a System.Data.DataRow object in .NET 1.1 no longer works on Windows Server 2012 and Windows 10;
  3. How to compare strings with fault tolerance.

Now to details.

 

The keyword INTO is not necessary when running INSERT SQL statements on Microsoft SQL Server

Debugging some odd mishap, I located the following piece of code:
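It contained a statement of roughly this shape (the table and column names are placeholders):

    INSERT Users (Name, Email)
    VALUES ('John Doe', 'john@example.com');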

According to the SQL standard, the keyword INSERT should be followed by the keyword INTO. Except it wasn’t. I thought that this had to be some obsolete code that no one uses. I checked the references and found a few. So that wasn’t it. The code obviously worked, as it has existed since 2012. So what the hell?! Well, it turns out that even though the keyword INTO is mandatory by the standard, most implementations (Microsoft SQL Server included) ignore this and keep it optional. I am definitely not adopting this, but it certainly is interesting.

 

Direct cast of System.Data.DataRow column value in .NET 1.1 does not work anymore on Windows Server 2012 and Windows 10

Yes, I know. Microsoft stopped supporting the .NET 1.1 framework with Windows 7. Still, we have some projects that run (or, more accurately, ran) properly even on newer Windows OSes. Except that with every update to Windows 10 and Server 2012, it is more and more obvious that .NET 1.1 is getting pushed out.

The latest thing was an InvalidCastException when executing this statement:
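The statement was of roughly this shape (the column name is a placeholder):

    int value = (int)row["STATUS"];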

where row is of type System.Data.DataRow. One would think that the value is not an integer, but in this case it was 103, which, by my books, is an integer. Interestingly enough, this works:
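Presumably something along these lines (an explicit conversion instead of an unboxing cast):

    int value = Convert.ToInt32(row["STATUS"]);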

Go figure.

 

How to compare strings with fault tolerance

In one of our projects, searching by people’s names and surnames just wasn’t good enough. Spelling mistakes and plain characters typed in place of Unicode ones were supposed to be taken into account.

After 5 minutes of “googling”, I found a StackOverflow answer that suggested using the Damerau-Levenshtein distance algorithm. The Levenshtein distance algorithm provides a way to calculate the number of edits that need to be made to one string to get another. The Damerau-Levenshtein algorithm is an upgrade that also counts transposed characters as a single edit.

However, this is just the first step. The algorithm provides you with a number of edits. To use it, you still need to define a threshold of how many mistakes you will allow. Fixed values are just not good if your string length varies. So, I used half of the length of either the search query or the provided value. It works like a charm.
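For reference, a sketch of the common simplified (optimal string alignment) Damerau-Levenshtein variant plus the half-length threshold described above. The method names are mine, and reading “either” as the longer of the two strings is my interpretation:

    // using System;
    static int Distance(string a, string b)
    {
        int[,] d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // deletions only
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insertions only

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(
                    d[i - 1, j] + 1,           // deletion
                    d[i, j - 1] + 1),          // insertion
                    d[i - 1, j - 1] + cost);   // substitution
                if (i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1])
                    d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + cost); // transposition
            }
        }
        return d[a.Length, b.Length];
    }

    static bool IsMatch(string query, string value)
    {
        // Threshold: half the length of the longer string.
        int threshold = Math.Max(query.Length, value.Length) / 2;
        return Distance(query, value) <= threshold;
    }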

Quick tip: Optimizing repeating try-catch-finally statement

Lately, I’ve started noticing a pattern in the data layer of one of our projects at work. The pattern looks like this:
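A sketch of its shape (the query, the _connectionString field and the Log helper are illustrative, not the actual project code; SqlConnection and friends live in System.Data.SqlClient):

    public DataTable GetCustomers()
    {
        SqlConnection connection = new SqlConnection(_connectionString);
        try
        {
            connection.Open();
            var adapter = new SqlDataAdapter("SELECT * FROM Customers", connection);
            var table = new DataTable();
            adapter.Fill(table);
            return table;
        }
        catch (Exception ex)
        {
            Log(ex); // hypothetical logging helper
            throw;
        }
        finally
        {
            connection.Close();
        }
    }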

This repeats itself in just about every data layer method. Lines and lines of useless, repeating code, for which I am also to take a lot of the blame. So I thought: “There must be a better way than this.”

And there is. I created this method in the data layer base class:
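Something along these lines (a sketch: the helper owns the connection lifecycle and error handling, and runs whatever work the caller passes in; the object return type matches the note below):

    protected object Execute(Func<SqlConnection, object> work)
    {
        SqlConnection connection = new SqlConnection(_connectionString);
        try
        {
            connection.Open();
            return work(connection);
        }
        catch (Exception ex)
        {
            Log(ex); // hypothetical logging helper
            throw;
        }
        finally
        {
            connection.Close();
        }
    }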

This enables me to now change every data layer method to look like this:
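For example, continuing the sketch above:

    public DataTable GetCustomers()
    {
        return (DataTable)Execute(connection =>
        {
            var adapter = new SqlDataAdapter("SELECT * FROM Customers", connection);
            var table = new DataTable();
            adapter.Fill(table);
            return table;
        });
    }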

This solution has a small issue, though. If you are doing an insert or update, you might not want to return anything. As you cannot return void, just define the return type to be object and return null. I am prepared to live with this.

 

Web developer: Why 2017 feels exactly like 1997?

I don’t know how many of you, dear readers, still remember what it was like being a web developer in the late 90s. You know, the time when not every kid knew how to build websites. The time of Geocities, Angelfire and Lycos. The time without Google (well, nearly). The time before cross-browser JavaScript frameworks, with no real support for CSS2. These were the times when 5% of your work was doing the actual web site and 95% of the time was spent tweaking HTML, CSS and JavaScript to actually work in Netscape 4 and IE 5. And in the end, you somehow always ended up doing layouts with tables in tables in tables… Yeah. You didn’t want to be THAT guy.

But with the web becoming “the thing” and websites starting to blossom, we got Netscape …uhm… 4 and IE6, CSS2 support improved (yeah, right) and the first semi-useful cross-browser library came to life. It was called cross-browser and, surprisingly enough, it is still online. As funny as it looks today, this was the first library where you didn’t have to pay attention to browser specifics. It gave us at least a glimmer of hope that the future was going to be better and bright…

Fast-forward 20 years. Internet Explorer and Netscape are a thing of the past. Chrome, Firefox, Edge and Safari browsers are now in. We have full CSS2 support (well, very nearly) and so many cross-browser JavaScript libraries that we can’t even name them all. Yet, working on my side project TimeLoggerLive, I started to wonder. Is it really that different? I mean, sure, new technologies are out (HTML5, CSS3, Angular7000 etc.), but have things actually changed for web developers?

CSS3 has been in development since the late 1990s and the HTML5 standard has been in preparation since the mid-2000s. Yet all these new browsers that we switched to because of their promised support for “the latest and greatest” standards still don’t support either in full. Worse even: like back in the 1990s, each and every browser implements the standard differently. You can imagine the confusion.

In the case of TimeLoggerLive, my kryptonite is the contenteditable attribute. Its functionality is awesome. By setting its value to true, you are supposed to be able to edit the content of any HTML tag, provided the attribute is set on that tag. Handy. Except it does not work on all tags in the IE and Edge browsers, Firefox shows some strange behavior if you use it on an empty cell, and Chrome, which offers the best implementation of it, for some odd reason distorts column widths.

I checked one of my favourite pages, CanIUse.com, and it is marked as fully supported across all browsers but Opera Mini. However, there is a “known issues” section, where it is explained that the IE browser does not support the contenteditable attribute on the following tags: TABLE, COL, COLGROUP, TBODY, TD, TFOOT, TH, THEAD, and TR. To work around this, one needs to put a DIV tag into each table cell. Groovy. Except, when you add that DIV tag, suddenly all browsers start showing a border around the editable content, which leads to more nasty CSS hacks.
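A sketch of the workaround and the follow-up hack (the class name is illustrative; the “border” browsers draw is the focus outline, which can be suppressed):

    <td><div contenteditable="true" class="cell-editor">42</div></td>

    <style>
        .cell-editor[contenteditable="true"]:focus { outline: none; }
    </style>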

Yes. It feels exactly like 1997.

Quick tip: Setting Oracle client collation

This week, one of our clients experienced an interesting problem. Data obtained from an ORACLE database did not display Unicode characters. They were either replaced by ‘?’ or some other character.

This happens for one of two reasons (or, in the worst-case scenario, both). Either your database has the wrong collation or your ORACLE client does. The former is a bit difficult to fix, as you will need to change the database collation and existing data. The latter is a bit easier. Here is how you do it:

ORACLE client 8.x
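A sketch of the registry change (for the old client, NLS_LANG sits under the root ORACLE key; the exact key can vary per installation, and the value shown matches the Slovenian WIN1250 setup described below):

    reg add "HKLM\SOFTWARE\ORACLE" /v NLS_LANG /t REG_SZ /d SLOVENIAN_SLOVENIA.EE8MSWIN1250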

ORACLE client 11.x
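For newer clients, the key is per Oracle home (the home name varies per machine; KEY_OraClient11g_home1 is a typical example):

    reg add "HKLM\SOFTWARE\ORACLE\KEY_OraClient11g_home1" /v NLS_LANG /t REG_SZ /d SLOVENIAN_SLOVENIA.EE8MSWIN1250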

I had to set the Slovenian WIN1250 encoding, and that is what the samples do. More languages and options can be found in the ORACLE documentation here and here.

Failed to load resources from file. Please check setup

Not so long ago, an application written in .NET 1.1 started popping up this error here and there. The funniest thing, though: only Windows 10 clients with the Creators Update installed were affected. Now, we could argue about why there is still an application written in .NET 1.1 up and running, but that could be a lengthy debate which I really don’t want to go into right now. Or ever.

Anyway. The error, as descriptive as it is, means only one thing: somewhere in your code, there is a StackOverflowException. In case you are wondering: no, the event logger won’t detect a thing. After much trial and error, I narrowed the problem down to this chunk of code:
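A reconstruction of its shape (GetValue and GetValueEx are the names from the story; the method bodies are assumptions):

    private string GetValue(string key)
    {
        try
        {
            object result = GetValueEx(key); // returns null when there are no hits
            return result.ToString();        // the dereference referred to as "line 3" below
        }
        catch (Exception)
        {
            return string.Empty;
        }
    }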

The method GetValueEx returns a response of type object. In this particular case, it should have been a string, but as there are no hits in the database, it returns null. So, basically, line 3 of the method GetValue should have thrown a NullReferenceException, which the catch statement should have caught. Except it doesn’t.

I don’t have enough information to explain all the details, but on Windows 10 Creators Update, line 3 throws a StackOverflowException, which is for some odd reason not handled by the try-catch block. And this causes the “Failed to load resources from file. Please check setup” error.

Knowing this, I modified my code to:
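Roughly like this (continuing the sketch above, with an explicit null guard so the dereference never happens):

    private string GetValue(string key)
    {
        try
        {
            object result = GetValueEx(key);
            if (result == null)
                return string.Empty; // guard: no hits in the database
            return result.ToString();
        }
        catch (Exception)
        {
            return string.Empty;
        }
    }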

Needless to say, the fix works without a glitch. Being a good Samaritan, I also posted the answer to this StackOverflow question.