20 Comments

I believe a likely explanation for why the interviewer wanted you to write it that way is that it is considered the "idiomatic" way.

If you have a copy of "The C Programming Language" 2nd edition handy, flip to page 106 (or look up strcpy in the index). It says "Although this may seem cryptic at first, the notational convenience is considerable, and the idiom should be mastered, because you will see it frequently in C programs."

Whether it is the most readable is, of course, debatable, but hopes of better compiler output may not have been the reason the interviewer wanted you to write it this way.
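For reference, here is a minimal sketch of the two forms being compared: an indexed loop and the K&R-style pointer idiom. The names CopyIndexed, CopyPointer, To and From are placeholders of mine, and this is not necessarily the exact code from the video.

```c
/* Indexed form: copy each character, stopping once the terminating 0
   has been copied. */
static void CopyIndexed(char *To, const char *From)
{
    int i;
    for (i = 0; (To[i] = From[i]) != 0; ++i) {
        /* all the work happens in the loop condition */
    }
}

/* K&R-style idiom: the value of the assignment doubles as the loop test,
   and both pointers advance as a side effect of the same expression. */
static void CopyPointer(char *To, const char *From)
{
    while ((*To++ = *From++) != 0) {
        /* empty body */
    }
}
```

Both versions assume the destination buffer is large enough for the source string plus its terminator. Neither checks bounds.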

author

That would be more unfortunate. I was assuming that the person at least had some concrete reason why they would suggest to someone that they change how they wrote a perfectly valid loop. If it was literally just because they read it in some book, that is such a terrible reason!

- Casey

Aug 4, 2023·edited Aug 4, 2023

Notwithstanding that thinking for yourself about the reasons for your actions is clearly superior to just doing what you read somewhere, this is not just some book, but basically the manual for the C language by the language's author, Dennis Ritchie (and Brian Kernighan). I can understand why the things written in there carry some weight.

I agree! Also, I'm super against the attitude of writing things in clever and terse ways just to save a few characters on the screen. The absolute worst case of this type of cleverness I have seen is in the Actua Soccer 96 source you can find on GitHub. It contains beautiful lines like this:

fc=(float)(rz[j]-SCREENDIST)/(rz[j]-rz[k]);
tpts2[pt]=rx[j]+(fc*(rx[k]-rx[j]));
ttex2[pt++]=grtexx[j]+(fc*(grtexx[k]-grtexx[j]));
tpts2[pt]=ry[j]+(fc*(ry[k]-ry[j]));
ttex2[pt++]=grtexy[j]+(fc*(grtexy[k]-grtexy[j]));
tpts2[pt++]=SCREENDIST;

I also remember seeing the `while` version in C/C++ books in the 90s.

Maybe the interviewer wanted to check whether the potential intern knew about pointer arithmetic? That you can, in fact, do something like *From++

That, and of course terseness for terseness's sake.
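To make the pointer-arithmetic point concrete, here is a small sketch of what `*From++` does. ReadAndAdvance and Cursor are hypothetical names used only for illustration.

```c
/* Hypothetical helper: returns the character at *Cursor and leaves
   *Cursor pointing at the next character. */
static char ReadAndAdvance(const char **Cursor)
{
    const char *From = *Cursor;
    char Value = *From++;   /* parsed as *(From++): read through the old
                               address, then step the pointer forward */
    *Cursor = From;
    return Value;
}
```

The postfix `++` binds tighter than the unary `*`, so it is the pointer that gets incremented, not the character it points to.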

author

It's possible, but they should have just asked that question. It's very confusing for someone to be asked how to do a string copy, then have them do it, and then have the person not seem to think that it is "done".

- Casey

The reason the interviewer wanted the second implementation probably wasn't that he thought it was more performant. He just wanted to check that you knew your stuff about pointer arithmetic (like @Yakvi mentioned) and pointer dereferencing.

Sep 12, 2023·edited Sep 12, 2023

I'm a bit late to the party, just catching up on Computer Enhance now!

Is there a reason you are suggesting lodsb, stosb for the "correct" way, rather than movs and rep?

My takeaway when looking through the manual back when doing part 1 was that movs was what you would use for string copies. It's a little hard to be sure I am understanding the manual correctly but it appears that doing:

rep movsb

would be 9 + 17/loop, which is either 25 or 16 cycles fewer than your 42 cycles, depending on how you count it and how long the string is.
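A rough sketch of that arithmetic, assuming the 42-cycle figure is the cost of moving one byte through the loop and taking the 9 + 17/loop count as read from the manual. The constant and function names are mine.

```c
/* Clock counts quoted in this comment; treated here as assumptions. */
enum { REP_SETUP = 9, REP_PER_BYTE = 17, LOOP_PER_BYTE = 42 };

/* Estimated cycles for rep movsb over Bytes bytes: one-time setup plus
   a fixed cost per repetition. */
static long RepMovsbCycles(long Bytes)
{
    return REP_SETUP + (long)REP_PER_BYTE * Bytes;
}

/* Estimated cycles for the byte-at-a-time loop over Bytes bytes. */
static long LoopCycles(long Bytes)
{
    return (long)LOOP_PER_BYTE * Bytes;
}
```

For a single byte the difference is 42 - (9 + 17) = 16 cycles. Each additional byte saves another 42 - 17 = 25, so the per-byte saving approaches 25 as the string gets longer.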

What's really messing me up is the part around 9:00 where you show a debugger from ~1994 that has a better interface than the typical 2023 Linux standard of GDB. Are you able to switch to a source code view instead of an assembly-level one there?

author

Yes, of course. It had source-level debugging as the default; I have to specifically ask for that ASM window.

- Casey

This is probably just a noob C question, but is it the compiler that adds this null check on the copy for us? For a moment I thought those for and while loops were going to go out of bounds.

author

It is just how C is specified. The result of an = expression is the value that was assigned, so if it appears in a conditional (like a for or while loop), then it will be treated as FALSE if it is 0 and TRUE if it is non-zero.

So, for example, "a = 5" both assigns 5 to the variable "a" *and* evaluates to 5, which can then be tested, and will be "true" because it is non-zero.
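A minimal sketch of that rule in action. AssignmentValue and CountUntilZero are hypothetical helpers for illustration, not code from the video.

```c
/* "a = 5" assigns 5 to a and also evaluates to 5, which tests as true. */
static int AssignmentValue(void)
{
    int a;
    if ((a = 5)) {      /* assigns 5, then tests 5: non-zero, so "true" */
        return a;
    }
    return 0;
}

/* The same rule ends a string loop: when the character assigned is the
   terminating 0, the assignment evaluates to 0 and the loop stops. */
static int CountUntilZero(const char *From)
{
    int Count = 0;
    char c;
    while ((c = *From++)) {
        ++Count;
    }
    return Count;
}
```

The extra parentheses around the assignments are only there to tell the compiler the `=` is intentional and not a mistyped `==`.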

- Casey

Do you happen to have unoptimized assembly output for those functions?

author

I didn't grab that, but I definitely can. It's easy to re-run Borland C++ with the optimizer off.

- Casey

It surprises me that the compiler would produce such bad output for the 2nd example. They are effectively the same, and in fact the 2nd should have been more obvious for the compiler than the 1st (where it could have made the mistake of putting i in bx, incrementing it separately, and then adding it to si and di, had it not cleverly deduced that both bracket expressions are pointers that can be incremented individually). I am surprised it failed on the 2nd version but handled the 1st cleverly.

That's why I am not bothered much by gimmicky optimizations like pre-increment instead of post-increment, do-while instead of a for loop, bracket indexing versus pointer increments, and others: many times the results are unpredictable and also depend on the compiler or CPU architecture. At most, I test variations before drawing conclusions when I need that extra bit for a specific compiler and architecture. Many have told me "you should use this instead of that because it's faster for the compiler", but when I check later, the results are unpredictable: in one tight loop I am writing it's better, in another it's worse, or the same. Compilers really are like ChatGPT: people put too much faith in them to automatically optimize things, but you also have to try various things to get that little bit extra, with varying results.

author

That is my opinion as well (assuming I understand you correctly). In my experience you have to check the compiled code if you care. You can never assume that there is a "correct way to write this in C" otherwise, because it can change between compilers, and even between two versions of the same compiler.

I would like to know how Microsoft's compiler handled these, and also how Borland's does if we switch to 80386 codegen, etc. So if I have time this weekend, I'm going to try to get a better emulation setup together so I can run more tests and get more ASM dumps!

- Casey

Interestingly, MSVC is the only modern compiler that doesn't generate the same code for both functions (I checked on godbolt). I think it also generates inferior code compared to the others, but I may be wrong: I joined the course this week and I am not an assembly expert. I checked because the Intel compiler was calling its optimised memcpy in the previous interview question, and I was curious whether it would identify the strcpy in this one.

I had no idea they had full GUI IDEs back in the DOS days, figured it'd all be command line. That's super cool!

author
Aug 4, 2023·edited Aug 4, 2023

Microsoft shipped one as well! It was called "Programmer's Workbench" and came with Microsoft C. I would have shown it in the video but I haven't been able to get the compiler to run under DOSBox, so I think I may have to create a virtual machine to get it working.

- Casey
