String Length

Getting String Length

You can get the number of char elements in a std::string with the length() or size() methods; the two are equivalent. Both run in constant time (guaranteed since C++11) and return std::string::size_type, an unsigned integer type. For plain ASCII text, this count equals the number of characters.

Example

#include <iostream>
#include <string>

int main() {
    std::string str = "Hello";
    std::cout << "Length: " << str.length() << '\n';
    std::cout << "Size: " << str.size();
    return 0;
}

Output

Length: 5
Size: 5

Empty Strings

You can check whether a string has no characters with the empty() method, which returns a bool: true if the string is empty. Streamed to std::cout, a bool prints as 1 or 0 unless std::boolalpha is set.

Example

#include <iostream>
#include <string>

int main() {
    std::string s1 = "Content";
    std::string s2;

    std::cout << "s1 empty? " << s1.empty() << '\n';
    std::cout << "s2 empty? " << s2.empty();
    return 0;
}

Output

s1 empty? 0
s2 empty? 1

Length vs Capacity

A string tracks both its length (the number of char elements it currently holds) and its capacity (the number of elements it can hold before it must reallocate). Capacity is often larger than length, which reduces reallocations as the string grows.

Example

#include <iostream>
#include <string>

int main() {
    std::string str;
    std::cout << "Initial capacity: " << str.capacity() << '\n';

    str = "This is a longer string";
    std::cout << "Length: " << str.length() << '\n';
    std::cout << "Capacity: " << str.capacity() << '\n';

    // Tip: reserve to avoid repeated reallocations
    str.clear();
    str.reserve(100);
    std::cout << "After reserve(100), capacity: " << str.capacity() << '\n';
    return 0;
}

Output

Initial capacity: 15
Length: 23
Capacity: 30
After reserve(100), capacity: 100

ℹ️ Note: Capacity is implementation-defined, so the numbers shown may differ across compilers and standard libraries. reserve(n) with n greater than the current capacity grows capacity to at least n; a request to shrink is non-binding and, since C++20, has no effect.

Unicode Caveat (UTF-8)

In a narrow std::string, size()/length() count bytes, not human-visible characters. With UTF-8 text, multi-byte code points (e.g., emoji) make size larger than the perceived character count.

If you need code-point counts in UTF-8, you can count the bytes that are not continuation bytes; this is exact for well-formed UTF-8 and only approximate for malformed input. For true user-perceived characters (grapheme clusters), a text library is required.

Example

#include <iostream>
#include <string>

// Count UTF-8 code points by counting bytes that are not 10xxxxxx (continuations)
std::size_t utf8_codepoints(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n; // not a continuation byte
    }
    return n;
}

int main() {
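    // UTF-8 bytes of U+1F600 followed by 'a' -> 2 code points, 5 bytes.
    // Written as explicit bytes because a u8"" literal is char8_t-based in
    // C++20 and no longer initializes a std::string.
    std::string s = "\xF0\x9F\x98\x80" "a";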
    std::cout << "bytes: " << s.size() << '\n';
    std::cout << "code points (approx): " << utf8_codepoints(s) << '\n';
    return 0;
}

Output

bytes: 5
code points (approx): 2

ℹ️ Note: Grapheme clusters (what users perceive as one character) can consist of multiple code points—use specialized libraries if you need that level of accuracy.

Best Practices

- Prefer '\n' over std::endl to avoid needless flushing.

- Remember size() is unsigned; take care when subtracting or comparing with signed values.

- Use reserve() to pre-allocate when you know the final size ahead of time; shrink_to_fit() is only a non-binding request to release unused capacity.

- For international text, be explicit about encodings; consider std::u8string (C++20), std::u32string, or a Unicode library for advanced needs.
