Sunday, March 27, 2011

What does strpbrk stand for?

I've used strpbrk() occasionally while doing low-level string work in C, but I've never been able to figure out what it stands for. I've always pronounced it in my head as "stir p bark", but that's never quite felt right.

It doesn't have an etymology as obvious as any of the other string functions, e.g. strchr (**str**ing **ch**a**r**acter) or strspn (**str**ing **sp**a**n**).

I vaguely recall reading somewhere that all of the original standard library functions were limited to 7-character names either to remain compatible with Fortran, or because in the original C standard, identifiers longer than 7 characters were considered equivalent to their 7-character prefix or something. Can anyone confirm/deny/clarify this?

From stackoverflow
  • I vaguely remember that strpbrk stands for "String Pointer Break", but I don't remember where I saw it.

  • The following conversation suggests "String Pointer Break": http://www.cpptalk.net/what-strspn-and-strpbrk-stands-for--vt1253.html

    Indeed, the Microsoft documentation capitalizes the name as StrPBrk, which tends to confirm that split of words: http://msdn.microsoft.com/en-us/library/bb760010(VS.85).aspx

    And finally this confirms it: http://www.gnu.org/software/libtool/manual/libc/Search-Functions.html

    The strpbrk (“string pointer break”) function is related to strcspn, except that it returns a pointer to the first character in string that is a member of the set stopset instead of the length of the initial substring. It returns a null pointer if no such character from stopset is found.

    bobobobo : Ow - breaking a pointer sounds painful
  • From the GNU C Library documentation:

    Function: char * strpbrk (const char *string, const char *stopset)

    The strpbrk ("string pointer break") function is related to strcspn, except that it returns a pointer to the first character in string that is a member of the set stopset instead of the length of the initial substring. It returns a null pointer if no such character from stopset is found.

    For example,

    strpbrk ("hello, world", " \t\n,.;!?") => ", world"

    The function returns a pointer to the first character of the string that appears in the STOPSET (aka BREAKSET). I mentally read it as "for STRing, return Pointer to BReaK".

  • As for your other question:

    I vaguely recall reading somewhere that all of the original standard library functions were limited to 7-character names either to remain compatible with Fortran, or because in the original C standard, identifiers longer than 7 characters were considered equivalent to their 7-character prefix or something. Can anyone confirm/deny/clarify this?

    The original ANSI C Standard said it was implementation defined how many characters of an external symbol would be significant, but that it had to be at least 6 characters (and the implementation was permitted to be insensitive to case for them). This was done because way back when, systems often had this type of limitation (whether it came from FORTRAN libraries, linker limitations or whatever).

    So while you'll see external names longer than that in the standard library, no two of those names start with the same 6 character sequence.

    FWIW, C99 bumped the minimum up to 31 characters. The C++ standard (1998) says that the implementation must document how much of an external name is significant, and suggests that it be at least 1024 characters. I know that Borland C++ 5.5 had a limit of something like 250 characters, which caused problems, particularly when using templates.

    A. Rex : Thanks for answering the other half of the question, which I personally found interesting.
