### Question

Common sense and the literature are clear about the behaviour of `strcmp()`:

```
int strcmp( const char *lhs, const char *rhs );
```

Negative value if `lhs` appears before `rhs` in lexicographical order.

Zero if `lhs` and `rhs` compare equal.

Positive value if `lhs` appears after `rhs` in lexicographical order.

I can't seem to make it return any values other than `-1`, `0` and `1`.

**Sure, this behaviour is consistent with the definition**, but I was expecting values bigger than `1` or smaller than `-1`, since the definition asserts that results will be `<0`, `0` or `>0`, not `-1`, `0` or `1`.

I tested this in several compilers and libraries with the same results. I would like to see an example where that's not the case.

Sample code:

```
#include <stdio.h>
#include <string.h>

int main(void)
{
    printf("%d ", strcmp("a", "a"));
    printf("%d ", strcmp("abc", "aaioioa"));
    printf("%d ", strcmp("eer", "tsdf"));
    printf("%d ", strcmp("cdac", "cdac"));
    printf("%d ", strcmp("zsdvfgh", "ertgthhgj"));
    printf("%d ", strcmp("abcdfg", "rthyuk"));
    printf("%d ", strcmp("ze34", "ze34"));
    printf("%d ", strcmp("er45\n", "io\nioa"));
    printf("%d", strcmp("jhgjgh", "cdgffd"));
}
```

`Result: 0 1 -1 0 1 -1 0 -1 1`

### Answer 1:

The C standard clearly says (C11 §7.24.4.2 The strcmp function):

The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.

It doesn't say how much greater than or less than zero the result must be; a function that always returns `-1`, `0` or `+1` meets the standard; so does a function that sometimes returns values with a magnitude larger than `1`, such as `-27`, `0`, `+35`. If your code is to conform to the C standard, it must not assume either set of results; it may only assume that the sign of the result is correct.

Here is an implementation of `strcmp()` that does not return `-1` or `+1`. It is named `str_cmp()` so that its result can be compared with that of `strcmp()`:

```
#include <string.h>
#include <stdio.h>

static int str_cmp(const char *s1, const char *s2)
{
    /* Skip the common prefix */
    while (*s1 == *s2 && *s1 != '\0')
        s1++, s2++;
    /* Compare the first differing bytes as unsigned char values */
    int c1 = (int)(unsigned char)*s1;
    int c2 = (int)(unsigned char)*s2;
    return (c1 - c2);
}

int main(void)
{
    printf("%d ", strcmp("a", "a"));
    printf("%d ", strcmp("abc", "aAioioa"));
    printf("%d\n", strcmp("eer", "tsdf"));
    printf("%d ", str_cmp("a", "a"));
    printf("%d ", str_cmp("abc", "aAioioa"));
    printf("%d\n", str_cmp("eer", "tsdf"));
    return 0;
}
```

When run on a Mac (macOS Mojave 10.14.6; GCC 9.2.0; Xcode 11.13.1), I get the output:

```
0 1 -1
0 33 -15
```

I did change your data slightly: `"aaioioa"` became `"aAioioa"`. The overall result is no different (but the value 33 is bigger than you'd get with the original string); the return value is less than, equal to, or greater than zero as required.

The `str_cmp()` function is a legitimate implementation and is loosely based on a historically common implementation of `strcmp()`. It has slightly more care in the return value, but you can find two minor variants of it on p106 of Brian W Kernighan and Dennis M Ritchie, *The C Programming Language*, 2nd Edn (1988), one using array indexing, the other using pointers:

```
/* Array-indexing version */
int strcmp(char *s, char *t)
{
    int i;
    for (i = 0; s[i] == t[i]; i++)
        if (s[i] == '\0')
            return 0;
    return s[i] - t[i];
}

/* Pointer version */
int strcmp(char *s, char *t)
{
    for ( ; *s == *t; s++, t++)
        if (*s == '\0')
            return 0;
    return *s - *t;
}
```

The K&R code might not return the expected result if the plain `char` type is signed and if one of the strings contains 'accented characters', characters from the range -128 .. -1 (or 0x80 .. 0xFF when viewed as unsigned values). My `str_cmp()` code treats the data as `unsigned char` (via the cast); the `(int)` cast isn't really necessary because of the assignments. The subtraction of two `unsigned char` values converted to `int` produces a result in the range `-255` .. `+255`. However, modern versions of the C library that return only `-1`, `0` or `+1` evidently don't use direct subtraction like that.

Note that the C11 standard §7.24.4 String comparison functions says:

The sign of a nonzero value returned by the comparison functions `memcmp`, `strcmp`, and `strncmp` is determined by the sign of the difference between the values of the first pair of characters (both interpreted as `unsigned char`) that differ in the objects being compared.

You can look at the question "How do I check if a value matches a string?". The outline there shows:

```
if (strcmp(first, second) == 0) // first equal to second
if (strcmp(first, second) <= 0) // first less than or equal to second
if (strcmp(first, second) < 0) // first less than second
if (strcmp(first, second) >= 0) // first greater than or equal to second
if (strcmp(first, second) > 0) // first greater than second
if (strcmp(first, second) != 0) // first unequal to second
```

Note how comparing to zero uses the same comparison operator as the test you're making.

You could (but probably shouldn't) write:

```
if (strcmp(first, second) <= -1) // first less than second
if (strcmp(first, second) >= +1) // first greater than second
```

You'd still get the same results, but it is not sensible to do so; always comparing with zero is easier and more uniform.

You can get a -1, 0, +1 result using:

```
unsigned char c1 = *s1;
unsigned char c2 = *s2;
return (c1 > c2) - (c1 < c2);
```

For unrestricted integers (rather than integers restricted to 0 .. 255), this is safe because it avoids integer overflows whereas subtraction gives the wrong result. For the restricted integers involved with 8-bit characters, overflow on subtraction is not an issue.

### Answer 2:

The specification says that the numbers have to be negative, zero or positive, but it doesn't lock down the exact value necessary. The library itself may behave in more specific ways.

The spec means that code like this is technically invalid:

```
if (strcmp(a, b) == 1)
```

This may "work on my machine" but fail on someone else's machine that uses a different library.

What you should be writing instead is:

```
if (strcmp(a, b) > 0)
```

That's all it really means: *expect* values other than just 1/-1 and code accordingly.

### Answer 3:

Please re-read this bit

Negative value if lhs appears before rhs in lexicographical order.

Is -1 sufficient for this statement to be true?

Zero if lhs and rhs compare equal.

Positive value if lhs appears after rhs in lexicographical order.

Is 1 sufficient for this statement to be true?

So the sample code is acting as per spec.

**EDIT**

Just test the return value for zero, less than zero or more than zero. As per spec this should work in all implementations.

**EDIT 2**

I think this will fulfil the spec - have not tested :-(

```
size_t i = 0;
while (s1[i] && s1[i] == s2[i]) {
    ++i; // Skip the common prefix
}
// Difference of the first differing bytes, compared as unsigned char
return (unsigned char)s1[i] - (unsigned char)s2[i];
```

This will return values other than 1, -1 or 0.

### Answer 4:

Here are a few examples of C libraries with `strcmp()` implementations that do not always return `-1`, `0` or `+1`:

The **Bionic libc** has a BSD-based implementation of `strcmp()`:

```
int
strcmp(const char *s1, const char *s2)
{
    while (*s1 == *s2++)
        if (*s1++ == 0)
            return (0);
    return (*(unsigned char *)s1 - *(unsigned char *)--s2);
}
```

**Dietlibc** does the same. Its version is even non-conforming if configured with `WANT_SMALL_STRING_ROUTINES`, because that variant subtracts plain `char` values rather than `unsigned char` values:

```
int
strcmp (const char *s1, const char *s2)
{
#ifdef WANT_SMALL_STRING_ROUTINES
    while (*s1 && *s1 == *s2)
        s1++, s2++;
    return (*s1 - *s2);
#else
    // a more advanced, conforming implementation that tests multiple characters
    // at a time but still returns the difference of characters as unsigned bytes
#endif
}
```

**Glibc** has this implementation of `strcmp` in its `generic` directory, used for exotic architectures:

```
int
strcmp (p1, p2)
     const char *p1;
     const char *p2;
{
    register const unsigned char *s1 = (const unsigned char *) p1;
    register const unsigned char *s2 = (const unsigned char *) p2;
    unsigned reg_char c1, c2;

    do
    {
        c1 = (unsigned char) *s1++;
        c2 = (unsigned char) *s2++;
        if (c1 == '\0')
            return c1 - c2;
    }
    while (c1 == c2);

    return c1 - c2;
}
```

**Musl** C library has a very compact implementation:

```
int strcmp(const char *l, const char *r)
{
    for (; *l==*r && *l; l++, r++);
    return *(unsigned char *)l - *(unsigned char *)r;
}
```

The **newlib** has this implementation:

```
int
_DEFUN (strcmp, (s1, s2),
        _CONST char *s1 _AND
        _CONST char *s2)
{
#if defined(PREFER_SIZE_OVER_SPEED) || defined(__OPTIMIZE_SIZE__)
    while (*s1 != '\0' && *s1 == *s2)
    {
        s1++;
        s2++;
    }
    return (*(unsigned char *) s1) - (*(unsigned char *) s2);
#else
    // a more advanced approach, testing 4 bytes at a time, still returning the difference of bytes
#endif
}
```

Many alternative C libraries seem to follow the same pattern and return the difference of bytes, which matches the specification. But the implementations you tested seem to consistently return `-1`, `0` or `+1`. Don't rely on this. It might change in future releases, or even with the same system using different compilation flags.

Source: `https://stackoverflow.com/questions/59779056/is-there-any-compiler-and-library-where-strcmp-returns-values-other-than-1-0`