author | Emilio Cobos Álvarez <ecoal95@gmail.com> | 2016-08-20 22:32:16 -0700 |
---|---|---|
committer | Emilio Cobos Álvarez <ecoal95@gmail.com> | 2016-09-16 11:34:07 -0700 |
commit | cfdf15f5d04d4fbca3e7fcb46a1dd658ade973cd (patch) | |
tree | f7d2087332f4506bb836dce901bc181e5ffc7fba /src/clang.rs | |
parent | bbd6b2c9919e02642a8874e5ceb2ba3b5c76adec (diff) |
Rewrite the core of the binding generator.
TL;DR: The binding generator is a mess right now. At first it was fun
(in a "this is challenging" sense) to improve on it, but this is not
sustainable.
The truth is that the current architecture of the binding generator is a huge
pile of hacks, so for the past few days I've been working on rewriting it with
a few goals.
1) Have the hacks as contained and identified as possible. They're sometimes
needed because of how clang exposes the AST, but ideally those hacks are well
identified and don't interact randomly with each other.
As an example, in the current bindgen, scanning the parameters of a
function that references a struct clones all the struct information, so if
the struct name later changes (because we mangle it), everything breaks.
2) Support extending the bindgen output without having to deal with clang. The
way I'm aiming to do this is completely separating the parsing stage from
the code generation one, and assigning a single id to each item the binding
generator produces.
3) No more random mutation of the internal representation from anywhere. That
means no more Rc<RefCell<T>>, no more random circular references, no more
borrow_state... nothing.
4) No more deduplication of declarations before code generation.
Current bindgen has a stage, called `tag_dup_decl`[1], that takes care of
deduplicating declarations. That's completely buggy, and for C++ it's a
complete mess, since we YOLO modify the world.
I've managed to get rid of this by using the clang canonical declaration, and
the definition, to avoid scanning any type/item twice.
5) Code generation should not modify any internal data structure. It can look
things up and traverse whatever it needs, but not modify anything at random.
6) Each item should have a canonical name, and a single source of mangling
logic, and that should be computed from the immutable state, at code
generation.
I've put some canonical_name stuff in the code generation phase, but it's
still not complete, and should change if I implement namespaces.
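To make goals 2) through 6) concrete, here is a minimal sketch of an id-based
representation. It is not the code in this commit: `ItemId`, `Item`, `Context`,
`codegen`, the `Struct_` prefix and the string keys standing in for clang's
canonical declarations are all hypothetical names picked for illustration.
Items refer to each other by id instead of by clone or Rc<RefCell<T>>, parsing
deduplicates on the canonical declaration, and code generation only gets a
shared reference.

```rust
use std::collections::HashMap;

/// Opaque handle to an item (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ItemId(usize);

/// A parsed item. Other items are referenced by id, never cloned.
#[derive(Debug)]
pub enum Item {
    Struct { name: String },
    Function { name: String, params: Vec<ItemId> },
}

/// Owns every item. Filled while parsing, read-only during codegen.
#[derive(Debug, Default)]
pub struct Context {
    items: Vec<Item>,
    // The string key is a stand-in for clang's canonical declaration.
    by_canonical_decl: HashMap<String, ItemId>,
}

impl Context {
    /// Parsing stage: a declaration that was already seen returns its
    /// existing id, so no later deduplication pass is needed.
    pub fn add(&mut self, canonical_decl: &str, make: impl FnOnce() -> Item) -> ItemId {
        if let Some(&id) = self.by_canonical_decl.get(canonical_decl) {
            return id;
        }
        let id = ItemId(self.items.len());
        self.items.push(make());
        self.by_canonical_decl.insert(canonical_decl.to_owned(), id);
        id
    }

    pub fn resolve(&self, id: ItemId) -> &Item {
        &self.items[id.0]
    }

    /// Single place where names are mangled, computed from immutable state.
    pub fn canonical_name(&self, id: ItemId) -> String {
        match self.resolve(id) {
            Item::Struct { name } => format!("Struct_{}", name),
            Item::Function { name, .. } => name.clone(),
        }
    }
}

/// Codegen stage: takes `&Context`, so it cannot mutate the items.
pub fn codegen(ctx: &Context) -> Vec<String> {
    (0..ctx.items.len())
        .map(ItemId)
        .map(|id| match ctx.resolve(id) {
            Item::Struct { .. } => format!("pub struct {};", ctx.canonical_name(id)),
            Item::Function { name, params } => {
                let args: Vec<String> = params
                    .iter()
                    .enumerate()
                    .map(|(i, p)| format!("arg{}: {}", i, ctx.canonical_name(*p)))
                    .collect();
                format!("pub fn {}({});", name, args.join(", "))
            }
        })
        .collect()
}

/// Usage: a struct seen twice gets one id, and the function that takes it
/// picks up the mangled name at code generation time.
pub fn example() -> Vec<String> {
    let mut ctx = Context::default();
    let foo = ctx.add("decl:foo", || Item::Struct { name: "foo".into() });
    let again = ctx.add("decl:foo", || Item::Struct { name: "foo".into() });
    assert_eq!(foo, again);
    ctx.add("decl:use_foo", || Item::Function {
        name: "use_foo".into(),
        params: vec![foo],
    });
    codegen(&ctx)
}
```

Because the function only stores the struct's id, changing the mangling rule in
`canonical_name` changes every use consistently, which is the failure mode
described in 1).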
Improvements pending until this can land:
1) Add support for missing core stuff, mainly generating functions (note that
we parse the signatures for types correctly though), bitfields, generating
C++ methods.
2) Add support for the necessary features that were added to work around some
C++ pitfalls, like opaque types, etc...
3) Add support for the sugar that Manish added recently.
4) Optionally (and I guess this can land without it, because basically nobody
uses it since it's so buggy), bring back namespace support.
These are not completely trivial, but I think I can do them quite easily with
the current architecture.
I'm putting the current state of affairs here as a request for comments... Any
thoughts? Note that there are still a few smells I want to eventually
re-redesign, like the ParseError::Recurse thing, but until that happens I'm
way happier with this kind of architecture.
I'm keeping the old `parser.rs` and `gen.rs` in tree just for reference while I
code, but they will go away.
[1]: https://github.com/Yamakaky/rust-bindgen/blob/master/src/gen.rs#L448
Diffstat (limited to 'src/clang.rs')
-rw-r--r-- | src/clang.rs | 179 |
1 files changed, 151 insertions, 28 deletions
diff --git a/src/clang.rs b/src/clang.rs
index f8a68e12..5618007b 100644
--- a/src/clang.rs
+++ b/src/clang.rs
@@ -15,9 +15,24 @@ pub struct Cursor {
     x: CXCursor
 }
 
+impl fmt::Debug for Cursor {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Cursor({} kind: {}, loc: {})",
+               self.spelling(), kind_to_str(self.kind()), self.location())
+    }
+}
+
 pub type CursorVisitor<'s> = for<'a, 'b> FnMut(&'a Cursor, &'b Cursor) -> Enum_CXChildVisitResult + 's;
 
 impl Cursor {
+    pub fn is_declaration(&self) -> bool {
+        unsafe { clang_isDeclaration(self.kind()) != 0 }
+    }
+
+    pub fn null() -> Self {
+        Cursor { x: unsafe { clang_getNullCursor() } }
+    }
+
     // common
     pub fn spelling(&self) -> String {
         unsafe {
@@ -43,18 +58,71 @@ impl Cursor {
         }
     }
 
+    pub fn fallible_semantic_parent(&self) -> Option<Cursor> {
+        let sp = self.semantic_parent();
+        if sp == *self || !sp.is_valid() {
+            return None;
+        }
+        Some(sp)
+    }
+
     pub fn semantic_parent(&self) -> Cursor {
         unsafe {
             Cursor { x: clang_getCursorSemanticParent(self.x) }
         }
     }
 
+    pub fn num_template_args(&self) -> c_int {
+        unsafe {
+            clang_Cursor_getNumTemplateArguments(self.x)
+        }
+    }
+
+
+    /// This function gets the translation unit cursor. Note that we shouldn't
+    /// create a TranslationUnit struct here, because bindgen assumes there will
+    /// only be one of them alive at a time, and dispose it on drop. That can
+    /// change if this would be required, but I think we can survive fine
+    /// without it.
+    pub fn translation_unit(&self) -> Cursor {
+        assert!(self.is_valid());
+        unsafe {
+            let tu = clang_Cursor_getTranslationUnit(self.x);
+            let cursor = Cursor {
+                x: clang_getTranslationUnitCursor(tu),
+            };
+            assert!(cursor.is_valid());
+            cursor
+        }
+    }
+
+    pub fn is_toplevel(&self) -> bool {
+        let mut semantic_parent = self.semantic_parent();
+
+        while semantic_parent.kind() == CXCursor_Namespace ||
+              semantic_parent.kind() == CXCursor_NamespaceAlias ||
+              semantic_parent.kind() == CXCursor_NamespaceRef
+        {
+            semantic_parent = semantic_parent.semantic_parent();
+        }
+
+        let tu = self.translation_unit();
+        // Yes, the second can happen with, e.g., macro definitions.
+        semantic_parent == tu || semantic_parent == tu.semantic_parent()
+    }
+
     pub fn kind(&self) -> Enum_CXCursorKind {
         unsafe {
             clang_getCursorKind(self.x)
         }
     }
 
+    pub fn is_anonymous(&self) -> bool {
+        unsafe {
+            clang_Cursor_isAnonymous(self.x) != 0
+        }
+    }
+
     pub fn is_template(&self) -> bool {
         self.specialized().is_valid()
     }
@@ -77,10 +145,11 @@ impl Cursor {
         }
     }
 
-    pub fn raw_comment(&self) -> String {
-        unsafe {
+    pub fn raw_comment(&self) -> Option<String> {
+        let s = unsafe {
             String_ { x: clang_Cursor_getRawCommentText(self.x) }.to_string()
-        }
+        };
+        if s.is_empty() { None } else { Some(s) }
     }
 
     pub fn comment(&self) -> Comment {
@@ -165,12 +234,18 @@ impl Cursor {
         }
     }
 
-    pub fn enum_val(&self) -> i64 {
+    pub fn enum_val_signed(&self) -> i64 {
         unsafe {
             clang_getEnumConstantDeclValue(self.x) as i64
         }
     }
 
+    pub fn enum_val_unsigned(&self) -> u64 {
+        unsafe {
+            clang_getEnumConstantDeclUnsignedValue(self.x) as u64
+        }
+    }
+
     // typedef
     pub fn typedef_type(&self) -> Type {
         unsafe {
@@ -195,7 +270,7 @@ impl Cursor {
     pub fn args(&self) -> Vec<Cursor> {
         unsafe {
             let num = self.num_args() as usize;
-            let mut args = vec!();
+            let mut args = vec![];
             for i in 0..num {
                 args.push(Cursor { x: clang_Cursor_getArgument(self.x, i as c_uint) });
             }
@@ -235,6 +310,12 @@ impl Cursor {
         }
     }
 
+    pub fn method_is_const(&self) -> bool {
+        unsafe {
+            clang_CXXMethod_isConst(self.x) != 0
+        }
+    }
+
     pub fn method_is_virtual(&self) -> bool {
         unsafe {
             clang_CXXMethod_isVirtual(self.x) != 0
@@ -274,29 +355,40 @@ impl PartialEq for Cursor {
             clang_equalCursors(self.x, other.x) == 1
         }
     }
-
-    fn ne(&self, other: &Cursor) -> bool {
-        !self.eq(other)
-    }
 }
 
 impl Eq for Cursor {}
 
 impl Hash for Cursor {
     fn hash<H: Hasher>(&self, state: &mut H) {
-        self.x.kind.hash(state);
-        self.x.xdata.hash(state);
-        self.x.data[0].hash(state);
-        self.x.data[1].hash(state);
-        self.x.data[2].hash(state);
+        unsafe { clang_hashCursor(self.x) }.hash(state)
     }
 }
 
 // type
+#[derive(Clone, Hash)]
 pub struct Type {
     x: CXType
 }
 
+impl PartialEq for Type {
+    fn eq(&self, other: &Self) -> bool {
+        unsafe {
+            clang_equalTypes(self.x, other.x) != 0
+        }
+    }
+}
+
+impl Eq for Type {}
+
+impl fmt::Debug for Type {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Type({}, kind: {}, decl: {:?}, canon: {:?})",
+               self.spelling(), type_to_str(self.kind()), self.declaration(),
+               self.declaration().canonical())
+    }
+}
+
 #[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
 pub enum LayoutError {
     Invalid,
@@ -358,7 +450,7 @@ impl Type {
 
     pub fn is_const(&self) -> bool {
         unsafe {
-            clang_isConstQualifiedType(self.x) == 1
+            clang_isConstQualifiedType(self.x) != 0
         }
     }
 
@@ -378,6 +470,24 @@ impl Type {
         }
     }
 
+    pub fn fallible_align(&self) -> Result<usize, LayoutError> {
+        unsafe {
+            let val = clang_Type_getAlignOf(self.x);
+            if val < 0 {
+                Err(LayoutError::from(val as i32))
+            } else {
+                Ok(val as usize)
+            }
+        }
+    }
+
+    pub fn fallible_layout(&self) -> Result<::ir::layout::Layout, LayoutError> {
+        use ir::layout::Layout;
+        let size = try!(self.fallible_size());
+        let align = try!(self.fallible_align());
+        Ok(Layout::new(size, align))
+    }
+
     pub fn align(&self) -> usize {
         unsafe {
             let val = clang_Type_getAlignOf(self.x);
@@ -427,7 +537,7 @@ impl Type {
     // function
     pub fn is_variadic(&self) -> bool {
         unsafe {
-            clang_isFunctionTypeVariadic(self.x) == 1
+            clang_isFunctionTypeVariadic(self.x) != 0
         }
     }
 
@@ -581,21 +691,25 @@ pub struct Index {
 }
 
 impl Index {
-    pub fn create(pch: bool, diag: bool) -> Index {
+    pub fn new(pch: bool, diag: bool) -> Index {
         unsafe {
             Index { x: clang_createIndex(pch as c_int, diag as c_int) }
         }
     }
+}
 
-    pub fn dispose(&self) {
+impl fmt::Debug for Index {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Index {{ }}")
+    }
+}
+
+impl Drop for Index {
+    fn drop(&mut self) {
         unsafe {
             clang_disposeIndex(self.x);
         }
     }
-
-    pub fn is_null(&self) -> bool {
-        self.x.is_null()
-    }
 }
 
 // Token
@@ -609,6 +723,12 @@ pub struct TranslationUnit {
     x: CXTranslationUnit
 }
 
+impl fmt::Debug for TranslationUnit {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "TranslationUnit {{ }}")
+    }
+}
+
 impl TranslationUnit {
     pub fn parse(ix: &Index, file: &str, cmd_args: &[String],
                  unsaved: &[UnsavedFile], opts: ::libc::c_uint) -> TranslationUnit {
@@ -655,12 +775,6 @@ impl TranslationUnit {
         }
     }
 
-    pub fn dispose(&self) {
-        unsafe {
-            clang_disposeTranslationUnit(self.x);
-        }
-    }
-
     pub fn is_null(&self) -> bool {
         self.x.is_null()
     }
@@ -687,6 +801,15 @@ impl TranslationUnit {
     }
 }
 
+impl Drop for TranslationUnit {
+    fn drop(&mut self) {
+        unsafe {
+            clang_disposeTranslationUnit(self.x);
+        }
+    }
+}
+
+
 // Diagnostic
 pub struct Diagnostic {
     x: CXDiagnostic
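
The diff replaces the explicit `dispose()` methods on `Index` and
`TranslationUnit` with `Drop` impls. The sketch below illustrates what that
buys, without linking against libclang: `Handle` and `disposal_order` are
hypothetical stand-ins, and "disposing" only records a name where the real code
calls `clang_disposeIndex` or `clang_disposeTranslationUnit`.

```rust
use std::cell::RefCell;

/// Stand-in for a libclang handle such as `CXIndex` (hypothetical type).
pub struct Handle<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Handle<'a> {
    fn drop(&mut self) {
        // The real impls call the matching clang_dispose* function here.
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates two handles and returns the order in which they were released.
pub fn disposal_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _index = Handle { name: "index", log: &log };
        let _tu = Handle { name: "translation_unit", log: &log };
        // No explicit dispose() calls: both handles are released when this
        // scope ends, in reverse declaration order.
    }
    log.into_inner()
}
```

With this shape a caller can no longer forget to dispose a handle or dispose it
twice, and a translation unit declared after its index is released before it.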