Archive-name: C++-faq/part8 Posting-Frequency: monthly Last-modified: Feb 29, 2000 URL: http://marshall-cline.home.att.net/cpp-faq-lite/ AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470 COPYRIGHT: This posting is part of "C++ FAQ Lite." The entire "C++ FAQ Lite" document is Copyright(C)1991-2000 Marshall Cline, Ph.D., cline@parashift.com. All rights reserved. Copying is permitted only under designated situations. For details, see section [1]. NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS. THE AUTHOR PROVIDES NO WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as the C++ FAQ Book. The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500% larger than this document, and is available in bookstores. For details, see section [3]. ============================================================================== SECTION [24]: Inheritance -- private and protected inheritance [24.1] How do you express "private inheritance"? When you use : private instead of : public. E.g., class Foo : private Bar { public: // ... }; ============================================================================== [24.2] How are "private inheritance" and "composition" similar? private inheritance is a syntactic variant of composition (has-a). 
E.g., the "Car has-a Engine" relationship can be expressed using composition:

    class Engine {
    public:
      Engine(int numCylinders);
      void start();                 // Starts this Engine
    };

    class Car {
    public:
      Car() : e_(8) { }             // Initializes this Car with 8 cylinders
      void start() { e_.start(); }  // Start this Car by starting its Engine
    private:
      Engine e_;                    // Car has-a Engine
    };

The same "has-a" relationship can also be expressed using private inheritance:

    class Car : private Engine {    // Car has-a Engine
    public:
      Car() : Engine(8) { }         // Initializes this Car with 8 cylinders
      using Engine::start;          // Start this Car by starting its Engine
    };

There are several similarities between these two forms of composition:
 * In both cases there is exactly one Engine member object contained in a Car
 * In neither case can users (outsiders) convert a Car* to an Engine*

There are also several distinctions:
 * The first form is needed if you want to contain several Engines per Car
 * The second form can introduce unnecessary multiple inheritance
 * The second form allows members of Car to convert a Car* to an Engine*
 * The second form allows access to the protected members of the base class
 * The second form allows Car to override Engine's virtual[20] functions

Note that private inheritance is usually used to gain access to the protected members of the base class, but this is usually a short-term solution (translation: a band-aid[24.3]).

==============================================================================

[24.3] Which should I prefer: composition or private inheritance?

Use composition when you can, private inheritance when you have to.

Normally you don't want to have access to the internals of too many other classes, and private inheritance gives you some of this extra power (and responsibility). But private inheritance isn't evil; it's just more expensive to maintain, since it increases the probability that someone will change something that will break your code.
A legitimate, long-term use for private inheritance is when you want to build a class Fred that uses code in a class Wilma, and the code from class Wilma needs to invoke member functions from your new class, Fred. In this case, Fred calls non-virtuals in Wilma, and Wilma calls (usually pure) virtuals[22.4] in itself, which are overridden by Fred. This would be much harder to do with composition.

    #include <iostream>

    class Wilma {
    protected:
      void fredCallsWilma()
        {
          std::cout << "Wilma::fredCallsWilma()\n";
          wilmaCallsFred();
        }
      virtual void wilmaCallsFred() = 0;   // A pure virtual function[22.4]
    };

    class Fred : private Wilma {
    public:
      void barney()
        {
          std::cout << "Fred::barney()\n";
          Wilma::fredCallsWilma();
        }
    protected:
      virtual void wilmaCallsFred()        // Overrides Wilma's pure virtual
        {
          std::cout << "Fred::wilmaCallsFred()\n";
        }
    };

==============================================================================

[24.4] Should I pointer-cast from a private derived class to its base class?

Generally, No.

From a member function or friend[14] of a privately derived class, the relationship to the base class is known, and the upward conversion from PrivatelyDer* to Base* (or PrivatelyDer& to Base&) is safe; no cast is needed or recommended.

However users of PrivatelyDer should avoid this unsafe conversion, since it is based on a private decision of PrivatelyDer, and is subject to change without notice.

==============================================================================

[24.5] How is protected inheritance related to private inheritance?

Similarities: both allow overriding virtual[20] functions in the private/protected base class, and neither claims the derived class is a kind-of its base.

Dissimilarities: protected inheritance allows derived classes of derived classes to know about the inheritance relationship. Thus your grand kids are effectively exposed to your implementation details.
This has both benefits (it allows subclasses of the protected derived class to exploit the relationship to the protected base class) and costs (the protected derived class can't change the relationship without potentially breaking further derived classes).

Protected inheritance uses the : protected syntax:

    class Car : protected Engine {
    public:
      // ...
    };

==============================================================================

[24.6] What are the access rules with private and protected inheritance?

Take these classes as examples:

    class B                    { /*...*/ };
    class D_priv : private   B { /*...*/ };
    class D_prot : protected B { /*...*/ };
    class D_publ : public    B { /*...*/ };
    class UserClass            { B b; /*...*/ };

None of the subclasses can access anything that is private in B. In D_priv, the public and protected parts of B are private. In D_prot, the public and protected parts of B are protected. In D_publ, the public parts of B are public and the protected parts of B are protected (D_publ is-a-kind-of-a B). class UserClass can access only the public parts of B, which "seals off" UserClass from B.

To make a public member of B public in D_priv or D_prot as well, name the member in a using-declaration with a B:: prefix. E.g., to make member B::f(int,float) public in D_prot, you would say:

    class D_prot : protected B {
    public:
      using B::f;  // Note: Not using B::f(int,float)
    };

==============================================================================

SECTION [25]: Coding standards

[25.1] What are some good C++ coding standards?

Thank you for reading this answer rather than just trying to set your own coding standards.

But beware that some people on comp.lang.c++ are very sensitive on this issue. Nearly every software engineer has, at some point, been exploited by someone who used coding standards as a "power play."
Furthermore some attempts to set C++ coding standards have been made by those who didn't know what they were talking about, so the standards end up being based on what was the state-of-the-art when the standards setters were writing code. Such impositions generate an attitude of mistrust for coding standards.

Obviously anyone who asks this question wants to be trained so they don't run off on their own ignorance, but nonetheless posting a question such as this one to comp.lang.c++ tends to generate more heat than light.

==============================================================================

[25.2] Are coding standards necessary? Are they sufficient?

Coding standards do not make non-OO programmers into OO programmers; only training and experience do that. If coding standards have merit, it is that they discourage the petty fragmentation that occurs when large organizations coordinate the activities of diverse groups of programmers.

But you really want more than a coding standard. The structure provided by coding standards gives neophytes one less degree of freedom to worry about, which is good. However pragmatic guidelines should go well beyond pretty-printing standards. Organizations need a consistent philosophy of design and implementation. E.g., strong or weak typing? references or pointers in interfaces? stream I/O or stdio? should C++ code call C code? vice versa? how should ABCs[22.3] be used? should inheritance be used as an implementation technique or as a specification technique? what testing strategy should be employed? inspection strategy? should interfaces uniformly have a get() and/or set() member function for each data member? should interfaces be designed from the outside-in or the inside-out? should errors be handled by try/catch/throw or by return codes? etc.

What is needed is a "pseudo standard" for detailed design. I recommend a three-pronged approach to achieving this standardization: training, mentoring[26.1], and libraries.
Training provides "intense instruction," mentoring allows OO to be caught rather than just taught, and high quality C++ class libraries provide "long term instruction." There is a thriving commercial market for all three kinds of "training." Advice by organizations who have been through the mill is consistent: Buy, Don't Build. Buy libraries, buy training, buy tools, buy consulting. Companies who have attempted to become a self-taught tool-shop as well as an application/system shop have found success difficult. Few argue that coding standards are "ideal," or even "good," however they are necessary in the kind of organizations/situations described above. The following FAQs provide some basic guidance in conventions and styles. ============================================================================== [25.3] Should our organization determine coding standards from our C experience? No! No matter how vast your C experience, no matter how advanced your C expertise, being a good C programmer does not make you a good C++ programmer. Converting from C to C++ is more than just learning the syntax and semantics of the ++ part of C++. Organizations who want the promise of OO, but who fail to put the "OO" into "OO programming", are fooling themselves; the balance sheet will show their folly. C++ coding standards should be tempered by C++ experts. Asking comp.lang.c++ is a start. Seek out experts who can help guide you away from pitfalls. Get training. Buy libraries and see if "good" libraries pass your coding standards. Do not set standards by yourself unless you have considerable experience in C++. Having no standard is better than having a bad standard, since improper "official" positions "harden" bad brain traces. There is a thriving market for both C++ training and libraries from which to pool expertise. One more thing: whenever something is in demand, the potential for charlatans increases. Look before you leap. 
Also ask for student-reviews from past companies, since not even expertise makes someone a good communicator. Finally, select a practitioner who can teach, not a full time teacher who has a passing knowledge of the language/paradigm.

==============================================================================

[25.4] What's the difference between <xxx> and <xxx.h> headers? [NEW!]

[Recently created thanks to Stan Brown (on 1/00).]

The headers in ISO Standard C++ don't have a .h suffix. This is something the standards committee changed from former practice. The details are different between headers that existed in C and those that are specific to C++.

The C++ standard library is guaranteed to have 18 standard headers from the C language. These headers come in two standard flavors, <cxxx> and <xxx.h> (where xxx is the basename of the header, such as stdio, stdlib, etc). These two flavors are identical except the <cxxx> versions provide their declarations in the std namespace only, and the <xxx.h> versions make them available both in std namespace and in the global namespace. The committee did it this way so that existing C code could continue to be compiled in C++, however the <xxx.h> versions are deprecated, meaning they are standard now but might not be part of the standard in future revisions. (See ISO clause D and subclause D.5 of the ISO C++ standard[6.12].)

The C++ standard library is also guaranteed to have 32 additional standard headers that have no direct counterparts in C, such as <iostream>, <string>, and <new>. You may see things like #include <iostream.h> and so on in old code, and some compiler vendors offer .h versions for that reason. But be careful: the .h versions, if available, may differ from the standard versions. And if you compile some units of a program with, for example, <iostream> and others with <iostream.h>, the program may not work.

For new projects, use only the <xxx> headers, not the <xxx.h> headers.
When modifying or extending existing code that uses the old header names, you should probably follow the practice in that code unless there's some important reason to switch to the standard headers (such as a facility available in standard <iostream> that was not available in the vendor's <iostream.h>). If you need to standardize existing code, make sure to change all C++ headers in all program units including external libraries that get linked in to the final executable. All of this affects the standard headers only. You're free to name your own headers anything you like; see [25.8]. ============================================================================== [25.5] Is the ?: operator evil since it can be used to create unreadable code? No, but as always, remember that readability is one of the most important things. Some people feel the ?: ternary operator should be avoided because they find it confusing at times compared to the good old if statement. In many cases ?: tends to make your code more difficult to read (and therefore you should replace those usages of ?: with if statements), but there are times when the ?: operator is clearer since it can emphasize what's really happening, rather than the fact that there's an if in there somewhere. Let's start with a really simple case. Suppose you need to print the result of a function call. In that case you should put the real goal (printing) at the beginning of the line, and bury the function call within the line since it's relatively incidental (this left-right thing is based on the intuitive notion that most developers think the first thing on a line is the most important thing): // Preferred (emphasizes the major goal -- printing): cout << funct(); // Not as good (emphasizes the minor goal -- a function call): functAndPrintOn(cout); Now let's extend this idea to the ?: operator. Suppose your real goal is to print something, but you need to do some incidental decision logic to figure out what should be printed. 
Since the printing is the most important thing conceptually, we prefer to put it first on the line, and we prefer to bury the incidental decision logic. In the example code below, variable n represents the number of senders of a message; the message itself is being printed to cout:

    int n = /*...*/;   // number of senders

    // Preferred (emphasizes the major goal -- printing):
    cout << "Please get back to " << (n==1 ? "me" : "us") << " soon!\n";

    // Not as good (emphasizes the minor goal -- a decision):
    cout << "Please get back to ";
    if (n==1)
      cout << "me";
    else
      cout << "us";
    cout << " soon!\n";

All that being said, you can get pretty outrageous and unreadable code ("write only code") using various combinations of ?:, &&, ||, etc. For example,

    // Preferred (obvious meaning):
    if (f())
      g();

    // Not as good (harder to understand):
    f() && g();

Personally I think the explicit if example is clearer since it emphasizes the major thing that's going on (a decision based on the result of calling f()) rather than the minor thing (calling f()). In other words, the use of if here is good for precisely the same reason that it was bad above: we want to major on the majors and minor on the minors.

In any event, don't forget that readability is the goal (at least it's one of the goals). Your goal should not be to avoid certain syntactic constructs such as ?: or && or || or if -- or even goto. If you sink to the level of a "Standards Bigot," you'll ultimately embarrass yourself since there are always counterexamples to any syntax-based rule. If on the other hand you emphasize broad goals and guidelines (e.g., "major on the majors," or "put the most important thing first on the line," or even "make sure your code is obvious and readable"), you're usually much better off.

Code must be written to be read, not by the compiler, but by another human being.
==============================================================================

[25.6] Should I declare locals in the middle of a function or at the top?

Declare near first use.

An object is initialized (constructed) the moment it is declared. If you don't have enough information to initialize an object until half way down the function, you should create it half way down the function when it can be initialized correctly. Don't initialize it to an "empty" value at the top then "assign" it later. The reason for this is runtime performance. Building an object correctly is faster than building it incorrectly and remodeling it later. Simple examples show a factor of 350% speed hit for simple classes like String. Your mileage may vary; surely the overall system degradation will be less than 350%, but there will be degradation. Unnecessary degradation.

A common retort to the above is: "we'll provide set() member functions for every datum in our objects so the cost of construction will be spread out." This is worse than the performance overhead, since now you're introducing a maintenance nightmare. Providing a set() member function for every datum is tantamount to public data: you've exposed your implementation technique to the world. The only thing you've hidden is the physical names of your member objects, but the fact that you're using a List and a String and a float, for example, is open for all to see.

Bottom line: Locals should be declared near their first use. Sorry that this isn't familiar to C experts, but new doesn't necessarily mean bad.

==============================================================================

[25.7] What source-file-name convention is best? foo.cpp? foo.C? foo.cc?

If you already have a convention, use it. If not, consult your compiler to see what the compiler expects. Typical answers are: .C, .cc, .cpp, or .cxx (naturally the .C extension assumes a case-sensitive file system to distinguish .C from .c).
We've often used .cpp for our C++ source files, and we have also used .C. In the latter case, we supply the compiler option that forces .c files to be treated as C++ source files (-Tdp for IBM CSet++, -cpp for Zortech C++, -P for Borland C++, etc.) when porting to case-insensitive file systems. None of these approaches has any striking technical superiority to the others; we generally use whichever technique is preferred by our customer (again, these issues are dominated by business considerations, not by technical considerations).

==============================================================================

[25.8] What header-file-name convention is best? foo.H? foo.hh? foo.hpp?

If you already have a convention, use it. If not, and if you don't need your editor to distinguish between C and C++ files, simply use .h. Otherwise use whatever the editor wants, such as .H, .hh, or .hpp.

We've tended to use either .hpp or .h for our C++ header files.

==============================================================================

[25.9] Are there any lint-like guidelines for C++?

Yes, there are some practices which are generally considered dangerous. However none of these are universally "bad," since situations arise when even the worst of these is needed:
 * A class Fred's assignment operator should return *this as a Fred& (allows chaining of assignments)
 * A class with any virtual[20] functions ought to have a virtual destructor[20.4]
 * A class with any of {destructor, assignment operator, copy constructor} generally needs all 3
 * A class Fred's copy constructor and assignment operator should have const in the parameter: respectively Fred::Fred(const Fred&) and Fred& Fred::operator= (const Fred&)
 * When initializing an object's member objects in the constructor, always use initialization lists rather than assignment. The performance difference for user-defined classes can be substantial (3x!)
* Assignment operators should make sure that self assignment[12.1] does nothing, otherwise you may have a disaster[12.2]. In some cases, this may require you to add an explicit test to your assignment operators[12.3]. * In classes that define both += and +, a += b and a = a + b should generally do the same thing; ditto for the other identities of built-in types (e.g., a += 1 and ++a; p[i] and *(p+i); etc). This can be enforced by writing the binary operations using the op= forms. E.g., Fred operator+ (const Fred& a, const Fred& b) { Fred ans = a; ans += b; return ans; } This way the "constructive" binary operators don't even need to be friends[14]. But it is sometimes possible to more efficiently implement common operations (e.g., if class Fred is actually String, and += has to reallocate/copy string memory, it may be better to know the eventual length from the beginning). ============================================================================== [25.10] Which is better: identifier names that_look_like_this or identifier names thatLookLikeThis? It's a precedent thing. If you have a Pascal or Smalltalk background, youProbablySquashNamesTogether like this. If you have an Ada background, You_Probably_Use_A_Large_Number_Of_Underscores like this. If you have a Microsoft Windows background, you probably prefer the "Hungarian" style which means you jkuidsPrefix vndskaIdentifiers ncqWith ksldjfTheir nmdsadType. And then there are the folks with a Unix C background, who abbr evthng n use vry srt idntfr nms. (AND THE FORTRN PRGMRS LIMIT EVRYTH TO SIX LETTRS.) So there is no universal standard. If your organization has a particular coding standard for identifier names, use it. But starting another Jihad over this will create a lot more heat than light. From a business perspective, there are only two things that matter: The code should be generally readable, and everyone in the organization should use the same style. Other than that, th difs r minr. 
============================================================================== [25.11] Are there any other sources of coding standards? Yep, there are several. Here are a few sources that you might be able to use as starting points for developing your organization's coding standards: * www.cs.princeton.edu/~dwallach/CPlusPlusStyle.html * [The old URL is <http://www.ses.com/~clarke/conventions/cppconventions_1.html> if anyone knows the new URL, please let me know] * www.oma.com/ottinger/Naming.html * v2ma09.gsfc.nasa.gov/coding_standards.html * fndaub.fnal.gov:8000/standards/standards.html * cliffie.nosc.mil/~NAPDOC/docprj/cppcodingstd/ * www.possibility.com/cpp/ * groucho.gsfc.nasa.gov/Code_520/Code_522/Projects/DRSL/documents/templates/cpp_style_guide.html * www.wildfire.com/~ag/Engineering/Development/C++Style/ * The Ellemtel coding guidelines are available at - [The old URL is <http://web2.airmail.net/~rks/ellhome.htm> if anyone knows the new URL, please let me know] - [The old URL is <http://www.rhi.hi.is/~harri/cpprules.html> if anyone knows the new URL, please let me know] - [The old URL is <http://euagate.eua.ericsson.se/pub/eua/c++> if anyone knows the new URL, please let me know] - [The old URL is <http://nestor.ceid.upatras.gr/programming/ellemtel/ellhome.htm> if anyone knows the new URL, please let me know] - www.doc.ic.ac.uk/lab/cplus/c++.rules/ Note: I do NOT warrant or endorse these URLs and/or their contents. They are listed as a public service only. I haven't checked their details, so I don't know if they'll help you or hurt you. Caveat emptor. ============================================================================== SECTION [26]: Learning OO/C++ [26.1] What is mentoring? It's the most important tool in learning OO. Object-oriented thinking is caught, not just taught. Get cozy with someone who really knows what they're talking about, and try to get inside their head and watch them solve problems. Listen. Learn by emulating. 
If you're working for a company, get them to bring someone in who can act as a mentor and guide. We've seen gobs and gobs of money wasted by companies who "saved money" by simply buying their employees a book ("Here's a book; read it over the weekend; on Monday you'll be an OO developer"). ============================================================================== [26.2] Should I learn C before I learn OO/C++? Don't bother. If your ultimate goal is to learn OO/C++ and you don't already know C, reading books or taking courses in C will not only waste your time, but it will teach you a bunch of things that you'll explicitly have to un-learn when you finally get back on track and learn OO/C++ (e.g., malloc()[16.3], printf()[15.1], unnecessary use of switch statements[20], error-code exception handling[17], unnecessary use of #define macros[9.3], etc.). If you want to learn OO/C++, learn OO/C++. Taking time out to learn C will waste your time and confuse you. ============================================================================== [26.3] Should I learn Smalltalk before I learn OO/C++? Don't bother. If your ultimate goal is to learn OO/C++ and you don't already know Smalltalk, reading books or taking courses in Smalltalk will not only waste your time, but it will teach you a bunch of things that you'll explicitly have to un-learn when you finally get back on track and learn OO/C++ (e.g., dynamic typing[27.3], non-subtyping inheritance[27.5], error-code exception handling[17], etc.). Knowing a "pure" OO language doesn't make the transition to OO/C++ any easier. This is not a theory; we have trained and mentored literally thousands of software professionals in OO. In fact, Smalltalk experience can make it harder for some people: they need to unlearn some rather deep notions about typing and inheritance in addition to needing to learn new syntax and idioms. 
This unlearning process is especially painful and slow for those who cling to Smalltalk with religious zeal ("C++ is not like Smalltalk, therefore C++ is evil"). If you want to learn OO/C++, learn OO/C++. Taking time out to learn Smalltalk will waste your time and confuse you. Note: I sit on both the ANSI C++ (X3J16) and ANSI Smalltalk (X3J20) standardization committees[6.11]. I am not a language bigot[6.4]. I'm not saying C++ is better or worse than Smalltalk; I'm simply saying that they are different[27.1]. ============================================================================== [26.4] Should I buy one book, or several? At least two. There are two categories of insight and knowledge in OO programming using C++. You're better off getting a "best of breed" book from each category rather than trying to find a single book that does an OK job at everything. The two OO/C++ programming categories are: * C++ legality guides -- what you can and can't do in C++[26.6]. * C++ morality guides -- what you should and shouldn't do in C++[26.5]. Legality guides describe all language features with roughly the same level of emphasis; morality guides focus on those language features that you will use most often in typical programming tasks. Legality guides tell you how to get a given feature past the compiler; morality guides tell you whether or not to use that feature in the first place. Meta comments: * Neither of these categories is optional. You must have a good grasp of both. * These categories do not trade off against each other. You shouldn't argue in favor of one over the other. They dove-tail. ============================================================================== [26.5] What are some best-of-breed C++ morality guides? Here's my personal (subjective and selective) short-list of must-read C++ morality guides, alphabetically by author: * Cline, Lomow, and Girou, C++ FAQs, Second Edition, 587 pgs, Addison-Wesley, 1999, ISBN 0-201-30983-1. 
Covers around 500 topics in a FAQ-like Q&A format. * Meyers, Effective C++, Second Edition, 224 pgs, Addison-Wesley, 1998, ISBN 0-201-92488-9. Covers 50 topics in a short essay format. * Meyers, More Effective C++, 336 pgs, Addison-Wesley, 1996, ISBN 0-201-63371-X. Covers 35 topics in a short essay format. Similarities: All three books are extensively illustrated with code examples. All three are excellent, insightful, useful, gold plated books. All three have excellent sales records. Differences: Cline/Lomow/Girou's examples are complete, working programs rather than code fragments or standalone classes. Meyers contains numerous line-drawings that illustrate the points. ============================================================================== [26.6] What are some best-of-breed C++ legality guides? Here's my personal (subjective and selective) short-list of must-read C++ legality guides, alphabetically by author: * Lippman and Lajoie, C++ Primer, Third Edition, 1237 pgs, Addison-Wesley, 1998, ISBN 0-201-82470-1. Very readable/approachable. * Stroustrup, The C++ Programming Language, Third Edition, 911 pgs, Addison-Wesley, 1998, ISBN 0-201-88954-4. Covers a lot of ground. Similarities: Both books are excellent overviews of almost every language feature. I reviewed them for back-to-back issues of C++ Report, and I said that they are both top notch, gold plated, excellent books. Both have excellent sales records. Differences: If you don't know C, Lippman's book is better for you. If you know C and you want to cover a lot of ground quickly, Stroustrup's book is better for you. ============================================================================== [26.7] Are there other OO books that are relevant to OO/C++? Yes! Tons! The morality[26.5] and legality[26.6] categories listed above were for OO programming. The areas of OO analysis and OO design are also relevant, and have their own best-of-breed books. 
There are tons and tons of good books in these other areas. The seminal book on OO design patterns is (in my personal, subjective and selective, opinion) a must-read book: Gamma et al., Design Patterns, 395 pgs, Addison-Wesley, 1995, ISBN 0-201-63361-2. Describes "patterns" that commonly show up in good OO designs. You must read this book if you intend to do OO design work.

==============================================================================

[26.8] But those books are too advanced for me since I've never used any programming language before; is there any hope for me?

Yes.

There are probably many C++ books that are targeted for people who are brand new programmers, but here's one that I've read: Heller, Who's afraid of C++?, AP Professional, 1996, ISBN 0-12-339097-4.

Note that you should supplement that book with one of the above books and/or the FAQ's sections on const correctness[18] and exception safety[17] since these topics aren't highlighted in that book.

==============================================================================

SECTION [27]: Learning C++ if you already know Smalltalk

[27.1] What's the difference between C++ and Smalltalk?

Both fully support the OO paradigm. Neither is categorically and universally "better" than the other[6.4]. But there are differences. The most important differences are:
 * Static typing vs. dynamic typing[27.2]
 * Whether inheritance must be used only for subtyping[27.5]
 * Value vs. reference semantics[28]

Note: Many new C++ programmers come from a Smalltalk background. If that's you, this section will tell you the most important things you need to know to make your transition. Please don't get the notion that either language is somehow "inferior" or "bad"[6.4], or that this section is promoting one language over the other (I am not a language bigot; I serve on both the ANSI C++ and ANSI Smalltalk standardization committees[6.11]). Instead, this section is designed to help you understand (and embrace!)
the differences.

==============================================================================

[27.2] What is "static typing," and how is it similar/dissimilar to Smalltalk?

Static typing means the compiler checks the type safety of every operation statically (at compile-time), rather than generating code that checks things at run-time. For example, with static typing, the signature matching for function arguments is checked at compile time, not at run-time. An improper match is flagged as an error by the compiler, not by the run-time system.

In OO code, the most common "typing mismatch" is invoking a member function against an object which isn't prepared to handle the operation. E.g., if class Fred has member function f() but not g(), and fred is an instance of class Fred, then fred.f() is legal and fred.g() is illegal. C++ (statically typed) catches the error at compile time, and Smalltalk (dynamically typed) catches the error at run-time. (Technically speaking, C++ is like Pascal --pseudo statically typed-- since pointer casts and unions can be used to violate the typing system; which reminds me: only use pointer casts and unions as often as you use gotos).

==============================================================================

[27.3] Which is a better fit for C++: "static typing" or "dynamic typing"? [UPDATED!]

[Recently added cross references to evilness of macros (on 3/00).]

[For context, please read the previous FAQ[27.2]].

If you want to use C++ most effectively, use it as a statically typed language.

C++ is flexible enough that you can (via pointer casts, unions, and #define macros) make it "look" like Smalltalk. But don't. Which reminds me: try to avoid #define: it's evil[9.3], evil[34.1], evil[34.2], evil[34.3].

There are places where pointer casts and unions are necessary and even wholesome, but they should be used carefully and sparingly. A pointer cast tells the compiler to believe you.
An incorrect pointer cast might corrupt your heap, scribble into memory owned by other objects, call nonexistent member functions, and cause general failures. It's not a pretty sight. If you avoid these and related constructs, you can make your C++ code both safer and faster, since anything that can be checked at compile time is something that doesn't have to be done at run-time.

If you're interested in using a pointer cast, use the new style pointer casts. The most common example of these is to change old-style pointer casts such as (X*)p into new-style dynamic casts such as dynamic_cast<X*>(p), where p is a pointer and X is a type. In addition to dynamic_cast, there is static_cast and const_cast, but dynamic_cast is the one that simulates most of the advantages of dynamic typing (the other is the typeid() construct; for example, typeid(*p).name() will return the name of the type of *p).

==============================================================================

[27.4] How do you use inheritance in C++, and is that different from Smalltalk?

Some people believe that the purpose of inheritance is code reuse. In C++, this is wrong. Stated plainly, "inheritance is not for code reuse."

The purpose of inheritance in C++ is to express interface compliance (subtyping), not to get code reuse. In C++, code reuse usually comes via composition rather than via inheritance. In other words, inheritance is mainly a specification technique rather than an implementation technique.

This is a major difference with Smalltalk, where there is only one form of inheritance (C++ provides private inheritance to mean "share the code but don't conform to the interface", and public inheritance to mean "kind-of"). The Smalltalk language proper (as opposed to coding practice) allows you to have the effect of "hiding" an inherited method by providing an override that calls the "does not understand" method.
Furthermore Smalltalk allows a conceptual "is-a" relationship to exist apart from the subclassing hierarchy (subtypes don't have to be subclasses; e.g., you can make something that is-a Stack yet doesn't inherit from class Stack).

In contrast, C++ is more restrictive about inheritance: there's no way to make a "conceptual is-a" relationship without using inheritance (the C++ work-around is to separate interface from implementation via ABCs[22.3]). The C++ compiler exploits the added semantic information associated with public inheritance to provide static typing.

==============================================================================

[27.5] What are the practical consequences of differences in Smalltalk/C++ inheritance?

[For context, please read the previous FAQ[27.4]].

Smalltalk lets you make a subtype that isn't a subclass, and allows you to make a subclass that isn't a subtype. This allows Smalltalk programmers to be very carefree in putting data (bits, representation, data structure) into a class (e.g., you might put a linked list into class Stack). After all, if someone wants an array-based-Stack, they don't have to inherit from Stack; they could inherit such a class from Array if desired, even though an ArrayBasedStack is not a kind-of Array!

In C++, you can't be nearly as carefree. Only mechanism (member function code), but not representation (data bits), can be overridden in subclasses. Therefore you're usually better off not putting the data structure in a class. This leads to a stronger reliance on abstract base classes[22.3].

I like to think of the difference between an ATV and a Maserati. An ATV (all terrain vehicle) is more fun, since you can "play around" by driving through fields, streams, sidewalks, and the like. A Maserati, on the other hand, gets you there faster, but it forces you to stay on the road. My advice to C++ programmers is simple: stay on the road.
Even if you're one of those people who like the "expressive freedom" to drive through the bushes, don't do it in C++; it's not a good fit.

==============================================================================

SECTION [28]: Reference and value semantics

[28.1] What is value and/or reference semantics, and which is best in C++?

With reference semantics, assignment is a pointer-copy (i.e., a reference). Value (or "copy") semantics mean assignment copies the value, not just the pointer. C++ gives you the choice: use the assignment operator to copy the value (copy/value semantics), or use a pointer-copy to copy a pointer (reference semantics). C++ allows you to override the assignment operator to do anything your heart desires, however the default (and most common) choice is to copy the value.

Pros of reference semantics: flexibility and dynamic binding (you get dynamic binding in C++ only when you pass by pointer or pass by reference, not when you pass by value).

Pros of value semantics: speed. "Speed" seems like an odd benefit for a feature that requires an object (vs. a pointer) to be copied, but the fact of the matter is that one usually accesses an object more than one copies the object, so the cost of the occasional copies is (usually) more than offset by the benefit of having an actual object rather than a pointer to an object.

There are three cases when you have an actual object as opposed to a pointer to an object: local objects, global/static objects, and fully contained member objects in a class. The most important of these is the last ("composition").

More info about copy-vs-reference semantics is given in the next FAQs. Please read them all to get a balanced perspective. The first few have intentionally been slanted toward value semantics, so if you only read the first few of the following FAQs, you'll get a warped perspective.

Assignment has other issues (e.g., shallow vs. deep copy) which are not covered here.
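
As a rough illustration of the two kinds of assignment described in [28.1], here is a minimal sketch. Point and both helper functions are hypothetical names invented for this example; they are not part of the FAQ.

```cpp
// Point is a hypothetical class used only for illustration.
struct Point {
    int x;
    int y;
};

// Value semantics: assignment copies the object itself.
// Returns true if changing the copy left the original untouched.
bool valueCopyIsIndependent() {
    Point a = {1, 2};
    Point b = {0, 0};
    b = a;       // copies the value of a into b
    b.x = 99;    // changes b only
    return a.x == 1 && b.x == 99;
}

// Reference semantics: assignment copies only the pointer.
// Returns true if a change made through the copied pointer is visible
// through the original pointer and in the object itself.
bool pointerCopyIsShared() {
    Point a = {1, 2};
    Point* p = &a;
    Point* q = 0;
    q = p;       // pointer-copy: q and p now refer to the same Point
    q->x = 99;   // changes the one shared object
    return p->x == 99 && a.x == 99;
}
```

Both functions are expected to return true: after the value copy there are two independent Points, while after the pointer copy there is still only one Point with two names.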
==============================================================================

[28.2] What is "virtual data," and how-can / why-would I use it in C++?

virtual data allows a derived class to change the exact class of a base class's member object. virtual data isn't strictly "supported" by C++, however it can be simulated in C++. It ain't pretty, but it works.

To simulate virtual data in C++, the base class must have a pointer to the member object, and the derived class must provide a new object to be pointed to by the base class's pointer. The base class would also have one or more normal constructors that provide their own referent (again via new), and the base class's destructor would delete the referent.

For example, class Stack might have an Array member object (using a pointer), and derived class StretchableStack might override the base class member data from Array to StretchableArray. For this to work, StretchableArray would have to inherit from Array, so Stack would have an Array*. Stack's normal constructors would initialize this Array* with a new Array, but Stack would also have a (possibly protected:) constructor that would accept an Array* from a derived class. StretchableStack's constructor would provide a new StretchableArray to this special constructor.

Pros:
 * Easier implementation of StretchableStack (most of the code is inherited)
 * Users can pass a StretchableStack as a kind-of Stack

Cons:
 * Adds an extra layer of indirection to access the Array
 * Adds some extra freestore allocation overhead (both new and delete)
 * Adds some extra dynamic binding overhead (reason given in next FAQ)

In other words, we succeeded at making our job easier as the implementer of StretchableStack, but all our users pay for it[28.5]. Unfortunately the extra overhead was imposed on both users of StretchableStack and on users of Stack.

Please read the rest of this section. (You will not get a balanced perspective without the others.)
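
The Stack/StretchableStack arrangement described in [28.2] might be sketched as follows. The four class names come from the text above; every member (append(), push(), size(), the initial capacity of 2, the doubling policy) is an assumption made up for this sketch, and copying is simply disabled to keep it short.

```cpp
#include <cstddef>

class Array {
public:
    explicit Array(std::size_t capacity)
        : data_(new int[capacity]), capacity_(capacity), size_(0) { }
    virtual ~Array() { delete[] data_; }

    // Fixed-size policy: refuse the element when the Array is full.
    virtual bool append(int value) {
        if (size_ == capacity_) return false;
        data_[size_++] = value;
        return true;
    }
    std::size_t size() const { return size_; }

protected:
    int*        data_;
    std::size_t capacity_;
    std::size_t size_;

private:
    Array(const Array&);               // copying disabled (sketch only)
    Array& operator=(const Array&);
};

class StretchableArray : public Array {
public:
    explicit StretchableArray(std::size_t capacity) : Array(capacity) { }

    // Stretching policy: double the storage when the Array is full.
    virtual bool append(int value) {
        if (size_ == capacity_) {
            std::size_t newCapacity = capacity_ ? capacity_ * 2 : 1;
            int* bigger = new int[newCapacity];
            for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
            delete[] data_;
            data_ = bigger;
            capacity_ = newCapacity;
        }
        data_[size_++] = value;
        return true;
    }
};

class Stack {
public:
    Stack() : array_(new Array(2)) { }        // normal constructor: own referent
    virtual ~Stack() { delete array_; }       // base class deletes the referent
    bool push(int value) { return array_->append(value); }
    std::size_t size() const { return array_->size(); }

protected:
    // Special constructor: a derived class supplies the referent.
    explicit Stack(Array* array) : array_(array) { }

private:
    Array* array_;                            // the simulated "virtual data"
    Stack(const Stack&);                      // copying disabled (sketch only)
    Stack& operator=(const Stack&);
};

class StretchableStack : public Stack {
public:
    StretchableStack() : Stack(new StretchableArray(2)) { }
};
```

Note how the sketch exhibits each listed cost: every push() goes through array_ (indirection), every Stack performs a new and a delete, and append() has to be a virtual call because Stack only knows it has some kind of Array.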
==============================================================================

[28.3] What's the difference between virtual data and dynamic data?

The easiest way to see the distinction is by an analogy with virtual functions[20]: A virtual member function means the declaration (signature) must stay the same in subclasses, but the definition (body) can be overridden. The overriddenness of an inherited member function is a static property of the subclass; it doesn't change dynamically throughout the life of any particular object, nor is it possible for distinct objects of the subclass to have distinct definitions of the member function.

Now go back and re-read the previous paragraph, but make these substitutions:
 * "member function" --> "member object"
 * "signature" --> "type"
 * "body" --> "exact class"

After this, you'll have a working definition of virtual data.

Another way to look at this is to distinguish "per-object" member functions from "dynamic" member functions. A "per-object" member function is a member function that is potentially different in any given instance of an object, and could be implemented by burying a function pointer in the object; this pointer could be const, since the pointer will never be changed throughout the object's life. A "dynamic" member function is a member function that will change dynamically over time; this could also be implemented by a function pointer, but the function pointer would not be const.
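
The two function-pointer techniques just described could look roughly like this. All names here (Operation, doubleIt, negateIt, PerObject, Dynamic, rebind) are hypothetical and exist only for this sketch.

```cpp
// Hypothetical function type and two sample behaviors.
typedef int (*Operation)(int);

inline int doubleIt(int x) { return 2 * x; }
inline int negateIt(int x) { return -x; }

// "Per-object" member function: each object picks its behavior when it is
// constructed, and the const pointer keeps that choice for the object's life.
class PerObject {
public:
    explicit PerObject(Operation op) : op_(op) { }
    int apply(int x) const { return op_(x); }
private:
    const Operation op_;   // const pointer: can never be reseated
};

// "Dynamic" member function: the pointer is not const, so the behavior can
// be replaced at any time during the object's life.
class Dynamic {
public:
    explicit Dynamic(Operation op) : op_(op) { }
    void rebind(Operation op) { op_ = op; }
    int apply(int x) const { return op_(x); }
private:
    Operation op_;         // non-const pointer: may change over time
};
```

Two PerObject instances of the same class can behave differently from each other, but neither can change its mind later; a Dynamic instance can.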
Extending the analogy, this gives us three distinct concepts for data members:
 * virtual data: the definition (class) of the member object is overridable in subclasses provided its declaration ("type") remains the same, and this overriddenness is a static property of the subclass
 * per-object-data: any given object of a class can instantiate a different conformal (same type) member object upon initialization (usually a "wrapper" object), and the exact class of the member object is a static property of the object that wraps it
 * dynamic-data: the member object's exact class can change dynamically over time

The reason they all look so much the same is that none of this is "supported" in C++. It's all merely "allowed," and in this case, the mechanism for faking each of these is the same: a pointer to a (probably abstract) base class. In a language that made these "first class" abstraction mechanisms, the difference would be more striking, since they'd each have a different syntactic variant.

==============================================================================

[28.4] Should I normally use pointers to freestore allocated objects for my data members, or should I use "composition"?

Composition.

Your member objects should normally be "contained" in the composite object (but not always; "wrapper" objects are a good example of where you want a pointer/reference; also the N-to-1-uses-a relationship needs something like a pointer/reference).
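
To make the two alternatives concrete, here is a minimal sketch that borrows the Car/Engine idea from the FAQ's composition example[24.2]. The class names CarByValue and CarByPointer, and the numCylinders() accessor, are assumptions introduced only for this illustration.

```cpp
class Engine {
public:
    explicit Engine(int numCylinders) : numCylinders_(numCylinders) { }
    int numCylinders() const { return numCylinders_; }
private:
    int numCylinders_;
};

// Composition: the Engine is physically contained in the Car object.
class CarByValue {
public:
    CarByValue() : engine_(8) { }
    int numCylinders() const { return engine_.numCylinders(); }
private:
    Engine engine_;        // fully contained member object
};

// Pointer to a freestore-allocated member object: needs new in the
// constructor, delete in the destructor, and an extra indirection on access.
class CarByPointer {
public:
    CarByPointer() : engine_(new Engine(8)) { }
    ~CarByPointer() { delete engine_; }
    int numCylinders() const { return engine_->numCylinders(); }
private:
    Engine* engine_;
    CarByPointer(const CarByPointer&);             // copying disabled (sketch only)
    CarByPointer& operator=(const CarByPointer&);
};
```

Both classes give the same answer to their users; the difference is in what each access and each construction costs.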
There are three reasons why fully contained member objects ("composition") have better performance than pointers to freestore-allocated member objects:
 * Extra layer of indirection every time you need to access the member object
 * Extra freestore allocations (new in constructor, delete in destructor)
 * Extra dynamic binding (reason given below)

==============================================================================

[28.5] What are the relative costs of the 3 performance hits associated with allocating member objects from the freestore?

The three performance hits are enumerated in the previous FAQ:
 * By itself, an extra layer of indirection is small potatoes
 * Freestore allocations can be a performance issue (the performance of the typical implementation of malloc() degrades when there are many allocations; OO software can easily become "freestore bound" unless you're careful)
 * The extra dynamic binding comes from having a pointer rather than an object. Whenever the C++ compiler can know an object's exact class, virtual[20] function calls can be statically bound, which allows inlining. Inlining allows zillions (would you believe half a dozen :-) optimization opportunities such as procedural integration, register lifetime issues, etc. The C++ compiler can know an object's exact class in three circumstances: local variables, global/static variables, and fully-contained member objects.

Thus fully-contained member objects allow significant optimizations that wouldn't be possible under the "member objects-by-pointer" approach. This is the main reason that languages which enforce reference-semantics have "inherent" performance challenges.

Note: Please read the next three FAQs to get a balanced perspective!

==============================================================================

[28.6] Are "inline virtual" member functions ever actually "inlined"?

Occasionally...
When the object is referenced via a pointer or a reference, a call to a virtual[20] function cannot be inlined, since the call must be resolved dynamically. Reason: the compiler can't know which actual code to call until run-time (i.e., dynamically), since the code may be from a derived class that was created after the caller was compiled.

Therefore the only time an inline virtual call can be inlined is when the compiler knows the "exact class" of the object which is the target of the virtual function call. This can happen only when the compiler has an actual object rather than a pointer or reference to an object. I.e., either with a local object, a global/static object, or a fully contained object inside a composite.

Note that the difference between inlining and non-inlining is normally much more significant than the difference between a regular function call and a virtual function call. For example, the difference between a regular function call and a virtual function call is often just two extra memory references, but the difference between an inline function and a non-inline function can be as much as an order of magnitude (for zillions of calls to insignificant member functions, loss of inlining virtual functions can result in 25X speed degradation! [Doug Lea, "Customization in C++," proc Usenix C++ 1990]).

A practical consequence of this insight: don't get bogged down in the endless debates (or sales tactics!) of compiler/language vendors who compare the cost of a virtual function call on their language/compiler with the same on another language/compiler. Such comparisons are largely meaningless when compared with the ability of the language/compiler to "inline expand" member function calls.
I.e., many language implementation vendors make a big stink about how good their dispatch strategy is, but if these implementations don't inline member function calls, the overall system performance would be poor, since it is inlining --not dispatching-- that has the greatest performance impact.

Note: Please read the next two FAQs to see the other side of this coin!

==============================================================================

[28.7] Sounds like I should never use reference semantics, right?

Wrong.

Reference semantics are A Good Thing. We can't live without pointers. We just don't want our s/w to be One Gigantic Rats Nest Of Pointers. In C++, you can pick and choose where you want reference semantics (pointers/references) and where you'd like value semantics (where objects physically contain other objects etc). In a large system, there should be a balance. However if you implement absolutely everything as a pointer, you'll get enormous speed hits.

Objects near the problem skin are larger than higher level objects. The identity of these "problem space" abstractions is usually more important than their "value." Thus reference semantics should be used for problem-space objects.

Note that these problem space objects are normally at a higher level of abstraction than the solution space objects, so the problem space objects normally have a relatively lower frequency of interaction. Therefore C++ gives us an ideal situation: we choose reference semantics for objects that need unique identity or that are too large to copy, and we can choose value semantics for the others. Thus the highest frequency objects will end up with value semantics, since we install flexibility where it doesn't hurt us (only), and we install performance where we need it most!

These are some of the many issues that come into play with real OO design. OO/C++ mastery takes time and high quality training. If you want a powerful tool, you've got to invest.

Don't stop now!
Read the next FAQ too!!

==============================================================================

[28.8] Does the poor performance of reference semantics mean I should pass-by-value?

Nope.

The previous FAQs were talking about member objects, not parameters. Generally, objects that are part of an inheritance hierarchy should be passed by reference or by pointer, not by value, since only then do you get the (desired) dynamic binding (pass-by-value doesn't mix with inheritance, since larger subclass objects get "sliced" when passed by value as a base class object).

Unless compelling reasons are given to the contrary, member objects should be by value and parameters should be by reference. The discussion in the previous few FAQs indicates some of the "compelling reasons" for when member objects should be by reference.

==============================================================================

-- Marshall Cline / 972-931-9470 / mailto:cline@parashift.com