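I recently came across an example, written by an expert abroad, that uses LLVM and Clang to modify source code. My project happens to need a source-to-source compiler, which is closely related, so here is a partial translation of his article.
A few years ago, Eli Bendersky wrote a blog post on how to build a source-to-source compiler with Clang (article link). In that older post he gave a short but complete program that uses Clang to rewrite C++ source code. The post was reportedly very popular, but LLVM has advanced a great deal in the years since, so Eli Bendersky rewrote the code against the LLVM and Clang releases current at the time.
The sample code has the following effect:
Input source file: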
void foo(int* a, int *b) {
  if (a[0] > 1) {
    b[0] = 2;
  }
}
Output source file:
// Begin function foo returning void
void foo(int* a, int *b) {
  if (a[0] > 1) // the 'if' part
  {
    b[0] = 2;
  }
}
// End function foo
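Using this sample as a blueprint, you can quickly build your own source-to-source compiler.
First, the idea behind the code:
1. ASTConsumer reads the AST produced by the Clang parser.
2. In the ASTConsumer, override HandleTopLevelDecl to receive each top-level declaration in the source, such as the function foo in the example above, and pass it to the visitor.
3. The RecursiveASTVisitor subclass does the actual rewriting of the source.
4. In the RecursiveASTVisitor subclass, override VisitStmt and VisitFunctionDecl to detect the target constructs and apply the rewrites.
5. The edits are recorded in the Rewriter, which then emits the rewritten source (this sample prints it to standard output rather than writing it back to the file).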
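Steps 3 and 4 rest on the curiously recurring template pattern (CRTP): the base class walks the tree and calls hooks that the derived class supplies. The following is a minimal sketch of that dispatch on a toy tree, assuming C++14. Node, ToyRecursiveVisitor, CollectingVisitor and MakeFooTree are hypothetical names invented for illustration; none of this is Clang's API or implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy AST node (hypothetical, far simpler than Clang's Stmt/Decl hierarchy).
struct Node {
  std::string Kind;  // e.g. "FunctionDecl", "IfStmt", "CompoundStmt"
  std::string Name;  // optional label, used here for function names
  std::vector<std::unique_ptr<Node>> Children;
};

// CRTP base: walks the tree in pre-order and dispatches to the derived
// class's hooks. A hook returning false aborts the traversal.
template <typename Derived>
class ToyRecursiveVisitor {
public:
  bool Traverse(const Node &N) {
    Derived &Self = static_cast<Derived &>(*this);
    if (N.Kind == "FunctionDecl" && !Self.VisitFunctionDecl(N))
      return false;
    if (N.Kind == "IfStmt" && !Self.VisitIfStmt(N))
      return false;
    for (const auto &Child : N.Children)
      if (!Traverse(*Child))
        return false;
    return true;
  }

  // Default hooks do nothing; a derived class shadows the ones it needs.
  bool VisitFunctionDecl(const Node &) { return true; }
  bool VisitIfStmt(const Node &) { return true; }
};

// Records what it sees instead of rewriting anything.
class CollectingVisitor : public ToyRecursiveVisitor<CollectingVisitor> {
public:
  bool VisitFunctionDecl(const Node &N) {
    Seen.push_back("function " + N.Name);
    return true;
  }
  bool VisitIfStmt(const Node &) {
    Seen.push_back("if");
    return true;
  }
  std::vector<std::string> Seen;
};

// Builds a toy tree shaped like: void foo(...) { if (...) { ... } }
inline std::unique_ptr<Node> MakeFooTree() {
  auto If = std::make_unique<Node>();
  If->Kind = "IfStmt";
  auto Body = std::make_unique<Node>();
  Body->Kind = "CompoundStmt";
  Body->Children.push_back(std::move(If));
  auto Func = std::make_unique<Node>();
  Func->Kind = "FunctionDecl";
  Func->Name = "foo";
  Func->Children.push_back(std::move(Body));
  return Func;
}
```

Traversing the tree from MakeFooTree with a CollectingVisitor records the function first and the if statement second, which mirrors the order in which the sample's VisitFunctionDecl and VisitStmt hooks fire for the input file shown earlier.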
Code:
//------------------------------------------------------------------------------
// Tooling sample. Demonstrates:
//
// * How to write a simple source tool using libTooling.
// * How to use RecursiveASTVisitor to find interesting AST nodes.
// * How to use the Rewriter API to rewrite the source code.
//
// Eli Bendersky
// This code is in the public domain
//------------------------------------------------------------------------------
#include <sstream>
#include <string>
#include "clang/AST/AST.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Frontend/ASTConsumers.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "clang/Rewrite/Core/Rewriter.h"
#include "llvm/Support/raw_ostream.h"
using namespace clang;
using namespace clang::driver;
using namespace clang::tooling;
static llvm::cl::OptionCategory ToolingSampleCategory("Tooling Sample");
// By implementing RecursiveASTVisitor, we can specify which AST nodes
// we're interested in by overriding relevant methods.
class MyASTVisitor : public RecursiveASTVisitor<MyASTVisitor> {
public:
  MyASTVisitor(Rewriter &R) : TheRewriter(R) {}

  bool VisitStmt(Stmt *s) {
    // Only care about If statements.
    if (isa<IfStmt>(s)) {
      IfStmt *IfStatement = cast<IfStmt>(s);
      Stmt *Then = IfStatement->getThen();
      TheRewriter.InsertText(Then->getLocStart(), "// the 'if' part\n", true,
                             true);
      Stmt *Else = IfStatement->getElse();
      if (Else)
        TheRewriter.InsertText(Else->getLocStart(), "// the 'else' part\n",
                               true, true);
    }
    return true;
  }

  bool VisitFunctionDecl(FunctionDecl *f) {
    // Only function definitions (with bodies), not declarations.
    if (f->hasBody()) {
      Stmt *FuncBody = f->getBody();
      // Type name as string
      QualType QT = f->getReturnType();
      std::string TypeStr = QT.getAsString();
      // Function name
      DeclarationName DeclName = f->getNameInfo().getName();
      std::string FuncName = DeclName.getAsString();
      // Add comment before
      std::stringstream SSBefore;
      SSBefore << "// Begin function " << FuncName << " returning " << TypeStr
               << "\n";
      SourceLocation ST = f->getSourceRange().getBegin();
      TheRewriter.InsertText(ST, SSBefore.str(), true, true);
      // And after
      std::stringstream SSAfter;
      SSAfter << "\n// End function " << FuncName;
      ST = FuncBody->getLocEnd().getLocWithOffset(1);
      TheRewriter.InsertText(ST, SSAfter.str(), true, true);
    }
    return true;
  }

private:
  Rewriter &TheRewriter;
};
// Implementation of the ASTConsumer interface for reading an AST produced
// by the Clang parser.
class MyASTConsumer : public ASTConsumer {
public:
  MyASTConsumer(Rewriter &R) : Visitor(R) {}

  // Override the method that gets called for each parsed top-level
  // declaration.
  bool HandleTopLevelDecl(DeclGroupRef DR) override {
    for (DeclGroupRef::iterator b = DR.begin(), e = DR.end(); b != e; ++b) {
      // Traverse the declaration using our AST visitor.
      Visitor.TraverseDecl(*b);
      (*b)->dump();
    }
    return true;
  }

private:
  MyASTVisitor Visitor;
};
// For each source file provided to the tool, a new FrontendAction is created.
class MyFrontendAction : public ASTFrontendAction {
public:
  MyFrontendAction() {}

  void EndSourceFileAction() override {
    SourceManager &SM = TheRewriter.getSourceMgr();
    llvm::errs() << "** EndSourceFileAction for: "
                 << SM.getFileEntryForID(SM.getMainFileID())->getName() << "\n";
    // Now emit the rewritten buffer.
    TheRewriter.getEditBuffer(SM.getMainFileID()).write(llvm::outs());
  }

  std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
                                                 StringRef file) override {
    llvm::errs() << "** Creating AST consumer for: " << file << "\n";
    TheRewriter.setSourceMgr(CI.getSourceManager(), CI.getLangOpts());
    return llvm::make_unique<MyASTConsumer>(TheRewriter);
  }

private:
  Rewriter TheRewriter;
};
int main(int argc, const char **argv) {
  CommonOptionsParser op(argc, argv, ToolingSampleCategory);
  ClangTool Tool(op.getCompilations(), op.getSourcePathList());
  // ClangTool::run accepts a FrontendActionFactory, which is then used to
  // create new objects implementing the FrontendAction interface. Here we use
  // the helper newFrontendActionFactory to create a default factory that will
  // return a new MyFrontendAction object every time.
  // To further customize this, we could create our own factory class.
  return Tool.run(newFrontendActionFactory<MyFrontendAction>().get());
}
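One detail the sample depends on is that the locations passed to the Rewriter always refer to the original file, even after earlier insertions have been queued. As a rough illustration of that idea only, here is a toy edit buffer that stores insertions against original offsets and applies them in a single pass. ToyRewriter and its methods are hypothetical names, and this is not how clang::Rewriter is actually implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the idea behind a rewriter: the original text
// is never modified; insertions are stored against original offsets.
class ToyRewriter {
public:
  explicit ToyRewriter(std::string Source) : Original(std::move(Source)) {}

  // Queue Text for insertion in front of the character at Offset.
  // Offset == size of the original text means "append at the end".
  void InsertText(std::size_t Offset, const std::string &Text) {
    assert(Offset <= Original.size() && "offset outside the buffer");
    Edits.push_back({Offset, Text});
  }

  // Build the rewritten text. stable_sort keeps insertions queued at the
  // same offset in the order they were added.
  std::string GetRewrittenText() const {
    std::vector<Edit> Sorted = Edits;
    std::stable_sort(
        Sorted.begin(), Sorted.end(),
        [](const Edit &A, const Edit &B) { return A.Offset < B.Offset; });
    std::string Result;
    std::size_t Pos = 0;
    for (const Edit &E : Sorted) {
      Result.append(Original, Pos, E.Offset - Pos);  // untouched original text
      Result += E.Text;                              // the inserted text
      Pos = E.Offset;
    }
    Result.append(Original, Pos, std::string::npos);  // remaining tail
    return Result;
  }

private:
  struct Edit {
    std::size_t Offset;
    std::string Text;
  };
  std::string Original;
  std::vector<Edit> Edits;
};
```

Because every offset is relative to the original text, the "End function" comment can be queued before the "Begin function" comment without either insertion shifting the other, which is the same freedom the sample's visitor enjoys when it calls InsertText while walking the AST.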
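Eli Bendersky has a large collection of LLVM sample code on GitHub:
https://github.com/eliben/llvm-clang-samples
The code above can be found here (it may be updated at any time):
https://github.com/eliben/llvm-clang-samples/blob/master/src_clang/tooling_sample.cpp
Small as it is, the sample is complete. With suitable modifications to the code above, you can quickly write a source-to-source compiler that fits your own needs.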