In Part I, we saw a few examples of image classification. In particular, counting objects seemed to be difficult for convolutional neural networks. After sharing my work on the fast.ai forums, I received a few suggestions and requests for further investigation.
The most common were:
Some transforms seemed unnecessary (e.g. crop and zoom)
Some transforms might be more useful (e.g. vertical flip)
Consider training the model from scratch (inputs come from a different distribution)
Try with more data
Try with different sizes
After regenerating our data we can look at it:
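For reference, here’s a rough sketch of how the regenerated data might be loaded and inspected with fastai v1 (the path, image size and transform settings here are illustrative, not the notebook’s exact values):

```python
from fastai.vision import *

# Illustrative path: one sub-folder of generated images per circle count
path = Path('data/counting')

# Keep flips (including vertical) but drop rotation/zoom/warp, per the suggestions above
tfms = get_transforms(do_flip=True, flip_vert=True, max_rotate=0., max_zoom=1., max_warp=0.)
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2, size=224,
                                  ds_tfms=tfms).normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7, 7))
```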
Now we can create a learner and train it on this new dataset.
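Something along these lines (a sketch using fastai v1’s create_cnn helper; the exact epoch count and learning rate in the notebook may differ):

```python
# Transfer learning from a pretrained ResNet-34, tracking accuracy
learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(5)  # epoch count illustrative
```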
Which gives us the following output:
Wow! Look at that, this time we’re getting 100% accuracy. It looks like if we throw enough data at it (and use proper transforms) this is a problem that can actually be trivially solved by convolutional neural networks. I honestly did not expect that at all going into this.
Different Sizes of Objects
One drawback of our previous dataset is that the objects we’re counting are all the same size. Is it possible this is making the task too easy? Let’s try creating a dataset with circles of various sizes.
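Here’s the kind of matplotlib helper that can generate such images (a sketch; the function name, figure size and radius range are assumptions rather than the notebook’s exact values):

```python
import numpy as np
import matplotlib.pyplot as plt

def save_circles_image(num_circles, filename, min_radius=0.02, max_radius=0.10):
    """Draw num_circles randomly placed circles of random size and save the figure."""
    fig, ax = plt.subplots(figsize=(2.56, 2.56))
    for _ in range(num_circles):
        x, y = np.random.rand(2)
        radius = np.random.uniform(min_radius, max_radius)
        ax.add_patch(plt.Circle((x, y), radius=radius, color='blue'))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.axis('off')
    fig.savefig(filename)
    plt.close(fig)
```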
Which allows us to create images that look something like:
Once again we can create a dataset this way and train a convolutional learner on it. Complete code on GitHub.
Still works! Once again I’m surprised. I had very little hope for this problem but these networks seem to have absolutely no issue with solving this.
This runs completely contrary to my expectations. I didn’t think we could count objects by classifying images. I should note that the network isn’t “counting” anything here; it’s simply putting each image into the class it thinks the image belongs to. For example, if we showed it an image with 10 circles, it would have to classify it as either “45”, “46”, “47”, “48” or “49”.
More generally, counting would probably make more sense as a regression problem than a classification problem. Still, this could be useful when trying to distinguish between object counts of a fixed and guaranteed range.
Over the last year I focused on what some call a “bottom-up” approach to studying deep learning. I reviewed linear algebra and calculus. I read Ian Goodfellow’s book “Deep Learning”. I built AlexNet, VGG and Inception architectures with TensorFlow.
While this approach helped me learn the bits and bytes of deep learning, I often felt too caught up in the details to create anything useful. For example, when reproducing a paper on superconvergence, I built my own ResNet from scratch. Instead of spending time running useful experiments, I found myself debugging my implementation and constantly unsure if I’d made some small mistake. It now looks like I did make some sort of implementation error as the paper was successfully reproduced by fast.ai and integrated into fast.ai’s framework for deep learning.
With all of this weighing on my mind I found it interesting that fast.ai advertised a “top-down” approach to deep learning. Instead of starting with the nuts and bolts of deep learning, they instead first seek to answer the question “How can you make the best/most accurate deep learning system?” and structure their course around this question.
The first lesson focuses on image classification via transfer learning. They provide a pre-trained ResNet-34 network that has learned weights using the ImageNet dataset. This has allowed it to learn various things about the natural world such as the existence of edges, corners, patterns and text.
After creating a competent pet classifier they recommend that students go out and try to use the same approach on a dataset of their own creation. For my part I’ve decided to try their approach on three different datasets, each chosen to be slightly more challenging than the last:
Classifying impressionist vs. modernist paintings
Distinguishing cats from kittens
Counting identical objects (circles)
Our first step is simply to import everything that we’ll need from the fastai library:
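With fastai v1 that amounts to something like:

```python
# The star import pulls in the vision application (data, models, transforms, metrics)
from fastai.vision import *
from fastai.metrics import error_rate
```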
Next we’ll take a look at the data itself. I’ve saved it in data/paintings. We’ll create an ImageDataBunch which automatically knows how to read labels for our data based off the folder structure. It also automatically creates a validation set for us.
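A sketch of what that looks like (the image size and validation split are my assumptions):

```python
path = Path('data/paintings')

# Labels are inferred from the sub-folder names; valid_pct holds out a random validation set
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  size=224).normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7, 7))
```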
Looking at the above images, it’s fairly easy to differentiate the solid lines of modernism from the soft edges and brush strokes of impressionist paintings. My hope is that this task will be just as easy for a pre-trained neural network that can already recognize edges and identify repeated patterns.
Now that we’ve prepped our dataset, we’ll prepare a learner and let it train for five epochs to get a sense of how well it does.
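Roughly speaking, something like this (a sketch using the create_cnn helper from fastai v1):

```python
# Fine-tune a pretrained ResNet-34 on the paintings data bunch from above
learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(5)
```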
Looking good! With virtually no effort at all we have a classifier that reaches 95% accuracy. This task proved to be just as easy as expected. In the notebook we take things a little further by choosing a better learning rate and training for a little while longer before ultimately reaching 100% accuracy.
The painting task ended up being as easy as we expected. For our second challenge we’re going to look at a dataset of about 180 cats and 180 kittens. Cats and kittens share many features (fur, whiskers, ears etc.) which seems like it would make this task harder. That said, a human can look at pictures of cats and kittens and easily differentiate between them.
This time our data is located in data/kittencat so we’ll go ahead and load it up.
Once again, let’s try a standard fastai CNN learner and run it for about 5 epochs to get a sense for how it’s doing.
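Concretely, something like the following (paths and settings assumed):

```python
path = Path('data/kittencat')
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  size=224).normalize(imagenet_stats)

learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(5)
```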
So we’re looking at about 86% accuracy. Not quite the 95% we saw when classifying paintings but perhaps we can push it a little higher by choosing a good learning rate and running our model for longer.
Below we are going to use the “Learning Rate Finder” to (surprise, surprise) find a good learning rate. We’re looking for portions of the plot in which the loss steadily decreases.
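In fastai v1 this looks something like:

```python
# Run the learning rate range test and plot loss against learning rate
learn.lr_find()
learn.recorder.plot()
```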
It looks like there is a sweet spot between 1e-5 and 1e-3. We’ll shoot for the ‘middle’ and just use 1e-4. We’ll also run for 15 epochs this time to allow more time for learning.
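That is, something like:

```python
# Train for longer with the learning rate chosen from the plot above
learn.fit_one_cycle(15, max_lr=1e-4)
```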
Not bad! With a little bit of learning rate tuning, we were able to get a validation accuracy of about 92% which is much better than I expected considering we had less than 200 examples of each class. I imagine if we collected a larger dataset we could do even better.
For my last task I wanted to see whether or not we could train a ResNet to “count” identical objects. So far we have seen that these networks excel at distinguishing between different objects, but can these networks also identify multiple occurrences of something?
Note: I specifically chose this task because I don’t believe it should be possible for a vanilla ResNet to accomplish this task. A typical convolutional network is set up to differentiate between classes based on the features of those classes, but there is nothing in a convolutional network that suggests to me that it should be able to count objects with identical features.
For this challenge we are going to synthesize our own dataset using matplotlib. We’ll simply generate plots with the correct number of circles in them as shown below:
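A sketch of the kind of matplotlib helper that can produce these plots (the function name and sizes are mine, not the notebook’s):

```python
import numpy as np
import matplotlib.pyplot as plt

def save_count_image(num_circles, filename, radius=0.05):
    """Scatter num_circles identically sized circles on a blank figure and save it."""
    fig, ax = plt.subplots(figsize=(2.56, 2.56))
    for _ in range(num_circles):
        x, y = np.random.rand(2)
        ax.add_patch(plt.Circle((x, y), radius=radius, color='blue'))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.axis('off')
    fig.savefig(filename)
    plt.close(fig)
```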
There are some things to note here:
When we create a dataset like this, we’re in uncharted territory as far as the pre-trained weights are concerned. Our network was trained on photographs of the natural world and expects its inputs to come from this distribution. We’re providing inputs from a completely different distribution (not necessarily a harder one!) so I wouldn’t expect transfer learning to work as flawlessly as it did in previous examples.
Our dataset might be trivially easy to learn. For example, if we wrote an algorithm that simply counted the number of “blue” pixels we could very accurately figure out how many circles were present as all circles are the same size.
We don’t need to hypothesize any further, though. We can just create our ImageDataBunch and pass it to a learner to see how well it does. For now we’ll just use a dataset with 1-5 elements.
Let’s create our learner and see how well it does with the defaults after 3 epochs.
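A sketch of both steps together (folder layout and settings assumed; one sub-folder per count, 1 through 5):

```python
path = Path('data/counting')
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  size=224).normalize(imagenet_stats)

# Defaults only: a pretrained ResNet-34 trained for 3 epochs
learn = create_cnn(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(3)
```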
So without any changes we’re sitting at over 85% accuracy. This surprised me, as I thought this task would be harder for our neural network given that each object it’s counting has identical features. If we run this experiment again with a learning rate of 1e-4 and for 15 cycles, things get even better:
Wow! We’ve pushed the accuracy up to 99%!
Ugh. This seems wrong to me…
I am not a deep learning pro but every fiber of my being screams out against convolutional networks being THIS GOOD at this task. I specifically chose this task to try to find a failure case! My understanding is that they should be able to identify composite features that occur in an image but there is nothing in there that says they should be able to count (or have any notion of what counting means!)
What I would guess is happening here is that there are certain visual patterns that can only occur for a given number of circles (for example, one circle can never create a line) and that our network uses these features to uniquely identify each class. I’m not sure how to prove this but I have an idea of how we might break it. Maybe we can put so many circles on the screen that the unique patterns become very hard to find. For example, instead of trying 1-5 circles, let’s try counting images that have 45-49 circles.
After re-generating our data (see Notebook for details) we can visualize it below:
Now we can run our learner against this and see how it does:
Hah! That’s more like it. Now our network can only achieve ~25% accuracy, which is only slightly better than chance (1 in 5). Playing around with the learning rate, I was only able to push this to 27%.
This makes more sense to me. There are no “features” in these images that would allow a network to look at one and instantly know how many circles are present. I suspect most humans also couldn’t glance at one of these images and know whether there are 45 or 46 elements present; we would have to fall back to a different approach and manually count them out.
At the end of last year’s retrospective, I set a number of goals for myself. It feels (really) bad to look back and realize that I didn’t complete a single one. I think it’s important to reflect on failures and shortcomings in order to understand them and hopefully overcome them going forward.
Goal 1: Write one blog post every week
Result: 13 posts / 52 weeks
In January 2018 I began the blog series Learn TensorFlow Now which walked users through the very basics of TensorFlow. For three months I stuck to my goal of writing one blog post every week and I’m very proud of how my published posts turned out. Unfortunately during April I took on a consulting project and my posts completely halted. Once I missed a single week I basically gave up on blogging altogether. While I don’t regret taking on a consulting project, I do regret that I used it as an excuse to stop blogging.
This year I would like to start over and try once again to write one blog post per week (off to a rough start considering it’s already the end of January!). I don’t really have a new strategy other than I will resolve not to quit entirely if I miss a week.
When I first started reading Ian Goodfellow’s “Deep Learning” I was very intimidated by the first few chapters covering the background mathematics of deep learning. While my linear algebra was solid, my calculus was very weak. I put the book away for three months and grinded through Khan Academy’s calculus modules. I say “grinded” because I didn’t enjoy this process at all. Every day felt like a slog and my progress felt painfully slow. Even knowing calculus would ultimately be applicable to deep learning, I struggled to stay focused and interested in the work.
When I came back to the book in the second half of 2018 I realized it was a mistake to stop reading. While the review chapters were mathematically challenging, the actual deep learning portions were much less difficult and most of the insights could be reached without worrying about the math at all. For example, I cannot prove to you that L1 regularization results in sparse weight matrices, but I am aware that such a proof exists (at least in the case of linear regression).
This year I would like to finish this book. I think it might be worth my time to try to implement some of the basic algorithms illustrated in the book without the use of PyTorch or TensorFlow, but that will remain a stretch goal.
Goal 3: Contribute to TensorFlow
Result: 1 Contribution?
In February one of my revised PRs ended up making it into TensorFlow. Since I opened it in December of the previous year I’ve only marked it as half a contribution. Other than this PR I didn’t actively seek out any other places where I could contribute to TensorFlow.
On the plus side, I recently submitted a pull request to PyTorch. It’s a small PR that helps bring the C++ API closer to the Python API. Since it’s not yet merged I guess I should only count this as half a contribution? At least that puts me at one full contribution to deep learning libraries for the year.
Goal 4: Compete in a more Challenging Kaggle competition
Result: 0 attempts
There’s not much to say here other than that I didn’t really seek out or attempt any Kaggle competitions. In the latter half of 2018 I began to focus on reinforcement learning, so I was more interested in other competitive environments such as OpenAI Gym and Halite.io. Unfortunately my RL agents were not very competitive when it came to Halite, but I’m hoping this year I will improve my RL knowledge and be able to submit some results to other competitions.
Goal 5: Work on HackerRank problems to strengthen my interview skills
Result: 3 months / 12 months
While I started off strong and completed lots of problems, I tapered off around the same time I stopped blogging. While I don’t feel super bad about stopping these exercises (I had started working, after all) I am a little sad because it didn’t really feel like I improved at solving questions. This remains an area I want to improve in but I don’t think I’m going to make it an explicit goal in 2019.
Goal 6: Get a job related to ML/AI
Result: 0 jobs
I did not apply for (or receive offers for) any ML/AI jobs during 2018. After focusing on consulting for most of the year I didn’t feel like I could demonstrate that I was proficient enough to be hired into the field. My understanding is that an end-to-end personal project is probably the best way to demonstrate true proficiency, and that’s something I want to pursue during 2019.
Goals for 2019
While I’m obviously not thrilled with my progress in 2018 I try not to consider failure a terminal state. I’m going to regroup and try to be more disciplined and consistent when it comes to my work this year. One activity that I’ve found both fun and productive is streaming on Twitch. I spent about 100 hours streaming and had a pretty consistent schedule during November and December.
Over the last nine posts, we built a reasonably effective digit classifier. Now we’re ready to enter the big leagues and try out our VGGNet on a more challenging image recognition task. CIFAR-10 (Canadian Institute For Advanced Research) is a collection of 60,000 cropped images of planes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
50,000 images in the training set
10,000 images in the test set
Size: 32×32 (1024 pixels)
3 Channels (RGB)
10 output classes
CIFAR-10 is a natural next step due to its similarities to the MNIST dataset. For starters, we have the same number of training images, testing images and output classes. CIFAR-10’s images are of size 32x32, which is convenient as we were padding MNIST’s images to achieve the same size. These similarities make it easy to use our previous VGGNet architecture to classify these images.
Despite the similarities, there are some differences that make CIFAR-10 a more challenging image recognition problem. For starters, our images are RGB and therefore have 3 channels. Detecting lines might not be so easy when they can be drawn in any color. Another challenge is that our images are now 2-D depictions of 3-D objects. In the above image, the center two images represent the “truck” class, but are shown at different angles. This means our network has to learn enough about “trucks” to recognize them at angles it has never seen before.
The above output shows that we’ve downloaded the dataset and created a training set of size 50,000 and a test set of size 10,000. Note: Unlike MNIST, these labels are not 1-hot encoded (otherwise they’d be of size 50,000x10 and 10,000x10 respectively). We have to account for this difference in shape when we build VGGNet for this dataset.
Let’s start by adjusting input and labels to fit the CIFAR-10 dataset:
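For example (a sketch against the TensorFlow 1.x API this series uses; the variable names are mine):

```python
import tensorflow as tf

# CIFAR-10 images are 32x32 RGB; labels are plain class indices rather than one-hot vectors
input = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
labels = tf.placeholder(tf.int32, shape=[None])
```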
Next we have to adjust the first layer of our network. Recall from the post on convolutions that each convolutional filter must match the depth of the layer against which it is convolved. Previously we had defined our convolutional filter to be of shape [3, 3, 1, 64]. That is, 64 3x3 convolutional filters, each with a depth of 1, matching the depth of our grayscale input image. Now that we’re using RGB images, we must define it to be of shape [3, 3, 3, 64]:
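Continuing the sketch from above (initialization details may differ from the series’ actual code):

```python
# 64 filters of shape 3x3x3, matching the 3-channel RGB input
conv1_weights = tf.Variable(tf.truncated_normal([3, 3, 3, 64], stddev=0.1))
conv1 = tf.nn.conv2d(input, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
```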
Finally, we must also modify our calculation of correct_prediction (used to calculate accuracy) to account for the change in label shape. We no longer have to take the tf.argmax of our labels because they’re already represented as a single number:
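Something along these lines, assuming the network’s final output tensor is called logits:

```python
# Labels are already class indices, so only the predictions need an argmax
predicted_class = tf.argmax(logits, axis=1, output_type=tf.int32)
correct_prediction = tf.equal(predicted_class, labels)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
```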
Note: We have to specify output_type=tf.int32 because tf.argmax() returns tf.int64 by default.
With that, we’ve got everything we need to test our VGGNet on CIFAR-10. The complete code is presented at the end of this post.
After running our network for 10,000 steps, we’re greeted with the following output:
Our final test accuracy appears to be approximately 71%, which isn’t too great. On one hand this is disappointing as it means our VGGNet architecture (or the method in which we’re training it) doesn’t generalize very well. On the other hand, CIFAR-10 presents us with new opportunities to try out new neural network components and architectures. In the next few posts we’ll explore some of these approaches to build a neural network that can handle the more complex CIFAR-10 dataset.
If you look carefully at the previous results you may have noticed something interesting. For the first time, our test accuracy (71%) is much lower than our training accuracy (~82-87%). This is a problem we’ll discuss in future posts on bias and variance in deep learning.
2017 saw a lot of change for me. I left Microsoft, returned to Toronto and shifted my focus from developer tools to machine learning. Following in patio11’s footsteps, I wanted to take a moment to reflect on the year and clarify my thoughts in written form.
July 2016 – July 2017
In July 2016, my company Code Connect joined Microsoft with the stated goal of integrating our Alive extension into Visual Studio 2017. Unfortunately our visas were delayed and we weren’t able to begin work until early September. It didn’t make sense to rush Alive into Visual Studio 2017 and risk introducing stability problems so we held off for a few months.
In the meantime, we conducted a number of experiments to try to quantify demand for Alive so we could compare it to other potential features. When the experiments concluded, the data didn’t demonstrate an immediate need for a product like Alive. Alive was put on ice and we worked on features more immediately pressing for Visual Studio’s success. At the time this meant focusing on accessibility, a top-down directive from Satya Nadella himself.
After Alive was sunset, I did some reflection on where I wanted to take my career and what I wanted to focus on. For a long time I’ve worried that the knowledge I’ve accumulated writing Visual Studio extensions is not immediately applicable outside of Visual Studio. Despite working on developer tools for almost five years, I would be on near-equal footing with a junior developer when it comes to developing extensions for VS Code or Eclipse. Much of my expertise comes in the form of random bits of trivia about Visual Studio. The problems I was faced with were challenging, but not very interesting.
I thought I liked "challenging problems". Lately I've realized there's a big intersection between "challenging" but "uninteresting" problems
The more I complained, the more I realized something had to change. In July 2017 I left Microsoft and applied for a position at a company for which I’d previously been an intern. However I didn’t have the C++ experience required for the roles they were staffing so we concluded it probably wouldn’t be a great fit.
While at Microsoft I began to learn about machine learning in my free time. In 2015 I’d watched in awe from the sidelines as AlphaGo crushed its human counterparts and I wanted in. I decided that now was the time for me to focus exclusively on machine learning and deep learning in particular.
Would I do it again?
Yup. Alive’s best chance for the long-term was a home at Microsoft. We had a modest number of paying customers but required an order of magnitude more in order to grow and continue full-time work on Alive. We had grand dreams of bringing Alive to other languages and we wouldn’t have been able to do so without hiring more developers. Microsoft came to us at the perfect time and gave Alive one last shot at success. I’m sad it didn’t work out, but I’m eternally grateful to everyone involved in getting us to Microsoft and to the Visual Studio editor team for being my home for the year.
August 2017 – Ongoing
The hardest thing about starting something brand new is figuring out where to start. I settled on a mix of linear algebra, Coursera, Kaggle and open-source work.
I’ve completed the first four courses of Andrew Ng’s deeplearning.ai specialization and am waiting for the final course on Sequence Models to launch in the coming month. These courses paired nicely with Andrej Karpathy’s CS231n videos. This specialization will be my go-to answer for the question “I want to learn about neural networks, where do I start?”
I wanted to apply the lessons I learned in Andrew’s videos to my own neural networks. I set out to compete in the introductory Kaggle competition “MNIST Digit Recognizer”. As I learned more and more about neural networks I would apply these lessons to my network and watch the score improve. Being told “batch normalization will improve your results” is one thing, but watching your score tick higher is something else altogether. As of this writing my best submission puts me in the top 25% of submissions with 99.171% accuracy.
I set a personal goal to contribute at least one pull request to TensorFlow so as to better understand the tool I was using. Coming from a .NET Desktop background, there was a bit of a learning curve when it came to tools like bazel and Docker. However, like most things in software development these tools just require a bit of time and focused energy to understand.
I’ve seen mixed success with my contributions. My first pull request was a correction to TensorFlow’s implementation of Inception network. The reviewers agreed that the initial model was incorrect, but are hesitant to change the model due to backward compatibility concerns.
My second pull request improved support for various image operations in TensorFlow. In short, it made it easier to augment multiple images at once (e.g. randomly flipping images left-to-right). Unfortunately, I introduced some performance regressions and my changes had to be reverted. 😢
My third pull request is a re-implementation of the last, while avoiding the performance regressions. It remains open, but I’m confident that after some work it will be accepted.
On the whole, I’m pleased with the progress I made with TensorFlow. The API surface is massive and I have a lot to learn, but I’m making real, measurable progress. I’ll continue to contribute back code where appropriate.
ICLR 2018 Reproducibility Challenge
At the end of each course, Andrew Ng took time to interview famous names in machine learning such as Geoffrey Hinton and Ian Goodfellow. One piece of advice they repeatedly shared with newcomers was “reproduce papers”. Around the same time, I stumbled upon the ICLR 2018 Reproducibility Challenge, where students are challenged to reproduce the results of papers submitted to the ICLR conference.
My only regret during 2017 was that I published zero blog posts. As such, this was the first year that traffic to my blog decreased.
I often tell others to start blogging, and this year my actions didn’t match my words. I attribute my poor track-record to one-part laziness and one-part lack of confidence. It’s surprisingly difficult to work up the courage to write about a subject when you’re brand new to it.
Goals for 2018
Follow Jeff Atwood’s advice for bloggers and stick to a schedule for blogging. I want to buckle down and write one blog post a week in 2018.
In the last post, we looked at using Roslyn to generate deltas between two compilations. Today we’ll take a look at how we can apply these deltas to a running process.
The CLR API
If you dig through Microsoft’s .NET Reference source occasionally you’ll come across extern methods like FastAllocateString() decorated with a special attribute: [MethodImplAttribute(MethodImplOptions.InternalCall)]. These are entry points to the CLR that can be called from managed code. Calling into the CLR can be done for a number of reasons. In the case of FastAllocateString it’s to implement certain functionality in native code for performance (in this case without even the overhead of P/Invoke). Other entry points are exposed to trigger CLR behavior like garbage collection or to apply deltas to a running process.
When I started this project I wasn’t even aware the CLR had an API. Fortunately Microsoft has recently released internal documentation that explains much of the CLR’s behavior including these APIs. Mscorlib and Calling Into the Runtime documents the differences between FCall, QCall and P/Invoke as entrypoints to the CLR from managed code.
Managing the many methods, classes and interfaces is a huge pain and too much work to do manually when starting out. Luckily Microsoft has released a managed wrapper that makes a lot of this stuff easier to work with. The Managed Debug Sample (mdbg) has everything we’ll need to attach to a process and apply changes to it.
The sample has a few extra projects. For our purposes we’ll need:
corapi – The managed API we’ll interact with directly
raw – Set of interfaces and COMImports over the ICorDebug API
NativeDebugWrappers – Functionality for low level Windows debugging
At a high level, our approach is going to be the following:
Create an instance of CorDebugger, a debugger we can use to create and attach to other processes.
Start a remote process
Intercept loading of modules and mark them for Edit and Continue
Creating an instance of the debugger is fairly involved. We first have to get an instance of the CLR host based on the version of the runtime we’re interested in (in our case anything after v4.0 will work). Working with the managed API is still awkward; certain types are created based on GUIDs that seem to be undocumented outside of sample code. Nonetheless the following code creates an instance of a managed debugger we can use.
In the following we get a list of runtimes available from the currently running process. I can’t offer insight into whether this is “good” or “bad” but it’s something to be aware of.
Starting the process
Once we’ve got a hold of our debugger, we can use it to start a process. While working on this I learned that we (in the .NET world) have been shielded from some of the peculiarities of creating a process on Windows. These peculiarities start to bleed through when creating processes with our custom debugger.
For example, if we want to send the argument 123456 to our new process, it turns out we have to send the process’ filename as the first argument as well. So the call to ICorDebug::CreateProcess(string applicationName, string commandLine) ends up looking something like
We also have to manually pass process flags when creating our process. These flags dictate various properties for our new process (Should a new window be created? Should we debug child processes of this process? etc.). Below we start a process, assuming that the application is in the current directory.
Mark Modules for Edit and Continue
By default the CLR doesn’t expect that EnC will be enabled. In order to enable it, we’ll have to manually set JIT flags on each module we’re interested in. CorDebug exposes an event that signals when a module has been loaded, so we’ll use this to control the flags.
A sample event handler for module loading might look like:
Notice in the above that we’re only setting the flag for the module we’re interested in. If we try to set the JIT flags for all modules we’ll run into exceptions when working with NGen-ed modules. The exception is a little cryptic and complains about “Zap Modules” but this turns out just to be the internal name for NGen modules.
Applying the Deltas
Finally. After three blog posts we’ve arrived at the point: Actually manipulating the running process.
In truth, we don’t apply our changes directly to the process, but to an individual module within it. So our first task is to find the individual module we want to change. We can search through all AppDomains, assemblies and modules to find the module with the correct name.
Once we find the module, we request metadata about it. This turns out to be a weird implementation detail: the CLR assumes you can’t possibly want to apply changes unless you’ve requested this info previously. We put this all together into the following:
I should at least touch on one more aspect of EnC I’ve glossed over thus far: remapping. If you are changing a method that has currently active statements, you will be given an opportunity to remap the current “Instruction Pointer” based on line number. It’s up to you to decide on which line execution should resume. The CorDebugger exposes OnFunctionRemapOpportunity and OnFunctionRemapComplete as events that allow you to guide remapping.
Here’s a sample remapping event handler:
We’ve now got all the pieces necessary to manipulate a running process and a good base to build off of. Complete code for today’s blog post can be found here on GitHub. Leave any questions in the comments and I’ll do my best to answer them or direct you to someone who can at Microsoft.
Our first task is to coerce Roslyn to emit metadata and IL deltas between two compilations. I say coerce because we’ll have to do quite a bit of work to get things working. The Compilation.EmitDifference() API is marked as public, but I’m fairly sure it’s yet to be actually used by the public. Getting everything to work requires reflection and manual copying of Roslyn code that doesn’t ship via NuGet.
The first order of business is to figure out what it takes to call Compilation.EmitDifference() in the first place. What parameters are we expected to provide? The signature:
An EmitBaseline represents a module created from a previous compilation. Modules live inside of assemblies and for our purposes it’s safe to assume that every module relates one-to-one with an assembly. (In reality multi-module assemblies can exist, but neither Visual Studio nor MSBuild support their creation). For more see this StackOverflow question.
We’ll look at the EmitBaseline as representing an assembly created from a previous compilation. We want to create a baseline to represent the initial compiled assembly before any changes are made to it. Roslyn can compare this baseline to new compilations we create.
Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> proves much harder to work with. This function maps between methods and various debug information including a method’s local variable slots, lambdas and closures. This information can be generated by reading .pdb symbol files. Unfortunately there’s no public API for generating this function. What’s worse is that we’ll have to use test APIs that don’t even ship via NuGet so even Reflection is out of the question.
Instead we’ll have to piece together bits of code from Roslyn’s test utilities. Ultimately this requires that we copy code from the following files:
It’s a bit of a pain that we need to bring so much of Roslyn with us just for the sake of one file. It’s sort of like working with a ball of yarn; you pull on one string and the whole thing comes with it.
The SymReaderFactory coupled with the DiaSymReader packages can interpret debug information from Microsoft’s PDB format. Once we’ve copied these files to our project we can use the SymReaderFactory to create a debug information provider by feeding the PDB stream to SymReaderFactory.CreateReader().
SemanticEdits describe the differences between compilations at the symbol level. For example, modifying a method will introduce a SemanticEdit for the corresponding IMethodSymbol marking it as updated. Roslyn will end up converting these SemanticEdits into proper IL and metadata deltas.
It turns out SemanticEdit is a public class. The problem is that they’re difficult to generate properly. We have to diff Documents across different versions of a Solution which means we have to take into account changes in syntax, trivia and semantics. We also have to detect invalid changes which aren’t (to my knowledge) officially or completely documented anywhere. In this Roslyn issue, I propose three potential approaches to generating the edits, but we’ll only take a look at the one I’ve implemented myself: using the internal CSharpEditAndContinueAnalyzer.
Since these classes are internal we’ll have to use Reflection to get at them. We’ll also need to keep a copy of the Solution around with which we used to generate our EmitBaseline. I’ve put all of the code together into a complete sample. The reflection based approach for CSharpEditAndContinueAnalyzer is demonstrated in the GetSemanticEdits method below.
We can see that this is quite a bit of work just to build the edits. In the above sample we made a number of simplifying assumptions. We assumed there were no errors in the compilation, that there were no illegal edits and no active statements. It’s important to cover all cases if you plan to consume this API properly.
Our next step will be to apply these deltas to a running process using APIs exposed by the CLR.