Our first task is to coerce Roslyn to emit metadata and IL deltas between two compilations. I say coerce because we’ll have to do quite a bit of work to get things working. The Compilation.EmitDifference() API is marked as public, but I’m fairly sure it’s yet to be actually used by the public. Getting everything to work requires reflection and manual copying of Roslyn code that doesn’t ship via NuGet.
The first order of business is to figure out what it takes to call Compilation.EmitDifference() in the first place. What parameters are we expected to provide? The signature:
public EmitDifferenceResult EmitDifference(
    EmitBaseline baseline,                              //Input: Information about the baseline compilation
    IEnumerable<SemanticEdit> edits,                    //Input: A collection of edits made to the program
    Stream metadataStream,                              //Output: Contains the metadata deltas
    Stream ilStream,                                    //Output: Contains the IL deltas
    Stream pdbStream,                                   //Output: Contains the .pdb deltas
    ICollection<MethodDefinitionHandle> updatedMethods) //Output: Contains the methods that changed
So based on the above, the two input arguments that we need to worry about are EmitBaseline and IEnumerable<SemanticEdit>. We’ll approach these one at a time.
EmitBaseline
An EmitBaseline represents a module created from a previous compilation. Modules live inside of assemblies, and for our purposes it’s safe to assume that every module relates one-to-one with an assembly. (In reality multi-module assemblies can exist, but neither Visual Studio nor MSBuild supports their creation.) For more, see this StackOverflow question.
We’ll look at the EmitBaseline as representing an assembly created from a previous compilation. We want to create a baseline to represent the initial compiled assembly before any changes are made to it. Roslyn can compare this baseline to new compilations we create.
A baseline can be created via EmitBaseline.CreateInitialBaseline():
public static EmitBaseline CreateInitialBaseline(
    ModuleMetadata module,
    Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> debugInformationProvider)
Now we’ve got two more problems: ModuleMetadata and a function that maps between MethodDefinitionHandle and EditAndContinueMethodDebugInformation.
ModuleMetadata
simply represents summary information about our module/assembly. Thankfully we can create it easily by passing our initial assembly to either ModuleMetadata.CreateFromFile
(for assemblies on disk) or ModuleMetadata.CreateFromStream
(for assemblies in memory).
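As a rough illustration of the in-memory route, the sketch below compiles a trivial program into a MemoryStream and hands that stream to ModuleMetadata.CreateFromStream. The class and method names (ModuleMetadataSketch, EmitAndReadModuleName) are made up for this example, and it assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced.

```csharp
using System;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class ModuleMetadataSketch
{
    // Hypothetical helper: emits a tiny assembly into memory and reads
    // the module name back through ModuleMetadata.
    public static string EmitAndReadModuleName()
    {
        var compilation = CSharpCompilation.Create(
            "Sketch",
            new[] { CSharpSyntaxTree.ParseText("class C { static void Main() { } }") },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.ConsoleApplication));

        var peStream = new MemoryStream();
        var emitResult = compilation.Emit(peStream);
        if (!emitResult.Success)
        {
            throw new InvalidOperationException("Emit failed");
        }

        // Rewind so the metadata reader starts at the beginning of the PE image.
        peStream.Position = 0;

        // Disposing the ModuleMetadata also disposes the stream (leaveOpen defaults to false).
        using (var module = ModuleMetadata.CreateFromStream(peStream))
        {
            return module.Name;
        }
    }
}
```

For an assembly that already lives on disk, ModuleMetadata.CreateFromFile(path) plays the same role, as the full sample later in this post does.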
Func<MethodDefinitionHandle, EditAndContinueMethodDebugInformation> proves much harder to work with. This function maps between methods and various debug information including a method’s local variable slots, lambdas and closures. This information can be generated by reading .pdb symbol files. Unfortunately there’s no public API for generating this function. What’s worse is that we’ll have to use test APIs that don’t even ship via NuGet so even Reflection is out of the question.
Instead we’ll have to piece together bits of code from Roslyn’s test utilities. Ultimately this requires that we copy code from the following files:
- ArrayBuilder.cs
- ComStreamWrapper.cs
- CustomDebugInfoReader.cs
- DummyMetadataImport.cs
- ObjectPool.cs
- PdbTestUtilities.cs
- PooledDictionary.cs
- PooledStringBuilder.cs
- StreamExtensions.cs
- SymReaderFactory.cs <- The file we actually need
- SymUnmanagedReaderExtensions.cs
- Token2SourceLineExporter.cs
We’ll also need to include two NuGet packages:
It’s a bit of a pain that we need to bring so much of Roslyn with us just for the sake of one file. It’s sort of like working with a ball of yarn; you pull on one string and the whole thing comes with it.
The SymReaderFactory coupled with the DiaSymReader packages can interpret debug information from Microsoft’s PDB format. Once we’ve copied these files to our project we can use the SymReaderFactory to create a debug information provider by feeding the PDB stream to SymReaderFactory.CreateReader().
IEnumerable<SemanticEdit>
SemanticEdits describe the differences between compilations at the symbol level. For example, modifying a method will introduce a SemanticEdit for the corresponding IMethodSymbol, marking it as updated. Roslyn will end up converting these SemanticEdits into proper IL and metadata deltas.
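To make the shape of a single edit concrete, here is a minimal sketch that builds one SemanticEdit by hand for a method whose body changed. It assumes we already know that C.F is the only thing that changed, which is exactly the knowledge a real diffing step has to work out. The helper names (SemanticEditSketch, Compile, FindF, BuildUpdateEdit) are invented for this example.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

public static class SemanticEditSketch
{
    // Hypothetical helper: builds a library compilation from a single source string.
    private static Compilation Compile(string source) =>
        CSharpCompilation.Create(
            "Sketch",
            new[] { CSharpSyntaxTree.ParseText(source) },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

    // Hypothetical helper: looks up the method symbol for C.F in a compilation.
    private static IMethodSymbol FindF(Compilation compilation) =>
        compilation.GetTypeByMetadataName("C").GetMembers("F").OfType<IMethodSymbol>().Single();

    public static SemanticEdit BuildUpdateEdit()
    {
        var oldCompilation = Compile("class C { public static int F() { return 1; } }");

        // Derive the new compilation from the old one by swapping the syntax tree.
        var newTree = CSharpSyntaxTree.ParseText("class C { public static int F() { return 2; } }");
        var newCompilation = oldCompilation.ReplaceSyntaxTree(oldCompilation.SyntaxTrees.Single(), newTree);

        // An Update edit pairs the old symbol with its replacement.
        return new SemanticEdit(SemanticEditKind.Update, FindF(oldCompilation), FindF(newCompilation));
    }
}
```

This only constructs the value. It does not validate that the change is a legal edit, and it ignores locals, lambdas and active statements.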
It turns out SemanticEdit is a public type. The problem is that edits are difficult to generate properly. We have to diff Documents across different versions of a Solution, which means we have to take into account changes in syntax, trivia and semantics. We also have to detect invalid changes, which aren’t (to my knowledge) officially or completely documented anywhere. In this Roslyn issue, I propose three potential approaches to generating the edits, but we’ll only take a look at the one I’ve implemented myself: using the internal CSharpEditAndContinueAnalyzer.
The CSharpEditAndContinueAnalyzer and its base class method AnalyzeDocumentAsync will generate a DocumentAnalysisResult with our edits along with some supplementary information about the changes. Were there errors? Were the changes substantial? Were there special areas of interest such as catch or finally blocks?
Since these classes are internal we’ll have to use Reflection to get at them. We’ll also need to keep around a copy of the Solution that we used to generate our EmitBaseline. I’ve put all of the code together into a complete sample. The reflection-based approach for CSharpEditAndContinueAnalyzer is demonstrated in the GetSemanticEdits method below.
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Threading;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;
using Microsoft.CodeAnalysis.Text;
//SymReaderFactory and GetEncMethodDebugInfo come from the copied Roslyn test utilities

static void FullWal()
{
    string sourceText_1 = @"
using System;
using System.Threading.Tasks;
class C
{
    public static void F() { Console.WriteLine(""Original Text""); }
    public static void Main() { F(); Console.ReadLine(); }
}";
    string sourceText_2 = @"
using System;
using System.Threading.Tasks;
class C
{
    public static void F() { Console.WriteLine(123456789); }
    public static void Main() { F(); Console.ReadLine(); }
}";
    string programName = "MyProgram.exe";
    string pdbName = "MyProgram.pdb";

    //Get solution
    Solution solution = createSolution(sourceText_1);

    //Get compilation
    var compilation = solution.Projects.Single().GetCompilationAsync().Result;

    //Emit the .exe and .pdb to disk
    var emitResult = compilation.Emit(programName, pdbName);
    if (!emitResult.Success)
    {
        throw new InvalidOperationException("Errors in compilation: " + emitResult.Diagnostics.Count());
    }

    //Build the EmitBaseline from the emitted assembly and its PDB
    var metadataModule = ModuleMetadata.CreateFromFile(programName);
    var fs = new FileStream(pdbName, FileMode.Open);
    var emitBaseline = EmitBaseline.CreateInitialBaseline(metadataModule, SymReaderFactory.CreateReader(fs).GetEncMethodDebugInfo);

    //Take solution, change it and compile it
    var document = solution.Projects.Single().Documents.Single();
    var updatedDocument = document.WithText(SourceText.From(sourceText_2, System.Text.Encoding.UTF8));
    var newCompilation = updatedDocument.Project.GetCompilationAsync().Result;

    //Get semantic edits with Reflection + CSharpEditAndContinueAnalyzer
    IEnumerable<SemanticEdit> semanticEdits = GetSemanticEdits(solution, updatedDocument);

    //Emit the metadata/IL deltas
    var metadataStream = new MemoryStream();
    var ilStream = new MemoryStream();
    var newPdbStream = new MemoryStream();
    var updatedMethods = new List<System.Reflection.Metadata.MethodDefinitionHandle>();
    var newEmitResult = newCompilation.EmitDifference(emitBaseline, semanticEdits, metadataStream, ilStream, newPdbStream, updatedMethods);
}

private static IEnumerable<SemanticEdit> GetSemanticEdits(Solution originalSolution, Document updatedDocument, CancellationToken token = default(CancellationToken))
{
    //Load our CSharpEditAndContinueAnalyzer and ActiveStatementSpan types via reflection
    Type csharpEditAndContinueAnalyzerType = Type.GetType("Microsoft.CodeAnalysis.CSharp.EditAndContinue.CSharpEditAndContinueAnalyzer, Microsoft.CodeAnalysis.CSharp.Features");
    Type activeStatementSpanType = Type.GetType("Microsoft.CodeAnalysis.EditAndContinue.ActiveStatementSpan, Microsoft.CodeAnalysis.Features");

    //The analyzer's constructor is not public, hence nonPublic: true
    dynamic csharpEditAndContinueAnalyzer = Activator.CreateInstance(csharpEditAndContinueAnalyzerType, nonPublic: true);

    var bindingFlags = BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public;
    Type[] targetParams = Type.EmptyTypes;

    //Create an empty ImmutableArray<ActiveStatementSpan> because we're not currently running the code
    var immutableArray_Create_T = typeof(ImmutableArray).GetMethod("Create", bindingFlags, binder: null, types: targetParams, modifiers: null);
    var immutableArray_Create_ActiveStatementSpan = immutableArray_Create_T.MakeGenericMethod(activeStatementSpanType);
    var immutableArray_ActiveStatementSpan = immutableArray_Create_ActiveStatementSpan.Invoke(null, new object[] { });

    //Invoke AnalyzeDocumentAsync and block until the returned task completes
    var method = (MethodInfo)csharpEditAndContinueAnalyzer.GetType().GetMethod("AnalyzeDocumentAsync");
    var myParams = new object[] { originalSolution, immutableArray_ActiveStatementSpan, updatedDocument, token };
    object task = method.Invoke(csharpEditAndContinueAnalyzer, myParams);
    var documentAnalysisResults = task.GetType().GetProperty("Result").GetValue(task);

    //Get the semantic edits from DocumentAnalysisResults
    var edits = (IEnumerable<SemanticEdit>)documentAnalysisResults.GetType().GetField("SemanticEdits", bindingFlags).GetValue(documentAnalysisResults);
    return edits;
}

private static Solution createSolution(string text)
{
    //Reference mscorlib so that System types such as Console resolve
    var mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);
    var adHocWorkspace = new AdhocWorkspace();
    var options = new CSharpCompilationOptions(OutputKind.ConsoleApplication, platform: Platform.X86);
    var project = adHocWorkspace.AddProject(ProjectInfo.Create(ProjectId.CreateNewId(), VersionStamp.Default, "MyProject", "MyProject", "C#", metadataReferences: new List<MetadataReference>() { mscorlib }, compilationOptions: options));
    adHocWorkspace.AddDocument(project.Id, "MyDocument.cs", SourceText.From(text, System.Text.Encoding.UTF8));
    return adHocWorkspace.CurrentSolution;
}
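The step in GetSemanticEdits that builds an empty ImmutableArray<ActiveStatementSpan> can be tried in isolation. The sketch below closes ImmutableArray.Create<T>() over an element type that is only known at run time. The ImmutableArraySketch class and CreateEmpty method are names invented for this example, and int stands in for the internal ActiveStatementSpan type.

```csharp
using System;
using System.Collections.Immutable;
using System.Reflection;

public static class ImmutableArraySketch
{
    // Hypothetical helper: returns a boxed, empty ImmutableArray<T> for a T chosen at run time.
    public static object CreateEmpty(Type elementType)
    {
        // Find the open generic, parameterless ImmutableArray.Create<T>() overload.
        MethodInfo createOpen = typeof(ImmutableArray).GetMethod("Create", Type.EmptyTypes);

        // Close it over the requested element type and invoke the static method.
        MethodInfo createClosed = createOpen.MakeGenericMethod(elementType);
        return createClosed.Invoke(null, null);
    }
}
```

Calling ImmutableArraySketch.CreateEmpty(typeof(int)) yields a boxed ImmutableArray<int> that can be passed as an argument to another reflected method, which is how the sample supplies the active statements parameter.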
We can see that this is quite a bit of work just to build the edits. In the above sample we made a number of simplifying assumptions. We assumed there were no errors in the compilation, that there were no illegal edits and no active statements. It’s important to cover all cases if you plan to consume this API properly.
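As one small step toward covering the error case, the sketch below shows a guard that could run after both Emit and EmitDifference (EmitDifferenceResult derives from EmitResult). The EmitChecks class and its two methods are hypothetical names for this example. It only surfaces compiler errors and does nothing about illegal edits or active statements.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

public static class EmitChecks
{
    // Hypothetical guard: throws with the error diagnostics when an emit step fails.
    public static void ThrowIfFailed(EmitResult result, string stage)
    {
        if (result.Success)
        {
            return;
        }

        var errors = result.Diagnostics
            .Where(d => d.Severity == DiagnosticSeverity.Error)
            .Select(d => d.ToString());

        throw new InvalidOperationException(
            stage + " failed:" + Environment.NewLine + string.Join(Environment.NewLine, errors));
    }

    // Demo only: emits deliberately broken source so the failure path can be observed.
    public static EmitResult EmitBrokenProgram()
    {
        var compilation = CSharpCompilation.Create(
            "Broken",
            new[] { CSharpSyntaxTree.ParseText("class C { void M() { int x = \"oops\"; } }") },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        return compilation.Emit(new MemoryStream());
    }
}
```

In the sample above this would be called as EmitChecks.ThrowIfFailed(newEmitResult, "EmitDifference") right after the call to EmitDifference.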
Our next step will be to apply these deltas to a running process using APIs exposed by the CLR.