Releases: wanghenshui/cppweeklynews
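C++ News Digest, Issue 52
A roundup of C++ news gathered from reddit/hackernews/lobsters/meetingcpp
There is a QQ channel; tap to join from mobile QQ
Submissions are welcome: to recommend or self-nominate an article, tool, or resource, please file an issue
News
Standards committee updates and IDE/compiler news go here
For the latest compiler news, follow the hellogcc WeChat account. This week's update: 2022-01-05, issue 139
Articles
Number-to-string conversion: sprintf performs very poorly; use fmt or std::to_chars instead.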
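As a rough illustration of the std::to_chars route, here is a minimal sketch. The helper name `int_to_string` and the buffer size are assumptions made for this example, not something taken from the linked article.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Convert an int to its decimal text with std::to_chars (C++17).
// `int_to_string` is a hypothetical helper name used only in this sketch.
std::string int_to_string(int value) {
    char buf[16];  // large enough for any 32-bit int, including the sign
    auto [ptr, ec] = std::to_chars(buf, buf + sizeof(buf), value);
    assert(ec == std::errc{});  // should not fail with a buffer this large
    return std::string(buf, ptr);  // ptr points one past the last written char
}
```

Unlike sprintf, std::to_chars parses no format string and ignores the locale, which is a large part of why it tends to be faster.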
namespace adl {
struct foo {};
void bar(foo) {}
}
int main() {
adl::foo foo;
bar(foo); // OK, ADL
(bar)(foo); // error: no ADL
}
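This is argument-dependent lookup: the function is found in the namespace of its argument's type, and parenthesizing the function name suppresses that lookup.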
#include <utility>
int main() {
std::unreachable();
return 42; // invokes undefined behavior
}
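Nothing much to add; it is similar in spirit to assert(false), except that reaching it is undefined behavior.
#include <cassert>
#include <iterator>
#include <vector>
int main() {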
std::vector v{1, 2, 3, 4};
assert(4 == std::size(v));
std::erase_if(v, [](const auto& e) { return e % 2;} );
assert(2 == std::size(v));
assert(v[0] == 2 and v[1] == 4);
}
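Nothing much to add.
A step-by-step tutorial on image filtering; I could not follow it.
Image sampling; I could not follow this one either.
A step-by-step gmock tutorial.
A couple of requirements: printing a string at compile time, and generating random numbers at compile time.
The first one is simple: a fixed_string plus a static_assert-style forced compile error.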
struct ct_str
{
char _data[512]{};
std::size_t _size{0};
template <std::size_t N>
constexpr ct_str(const char (&str)[N]) : _data{}, _size{N - 1}
{
for(std::size_t i = 0; i < _size; ++i)
_data[i] = str[i];
}
};
template <ct_str> struct print;
constexpr ct_str test()
{
ct_str s{"Welcome to Wordlexpr!"};
s._data[0] = 'w';
s._data[11] = 'w';
s._data[20] = '.';
return s;
}
print<test()> _{};
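The second one uses the fixed_string to build a random number generator: the seed and the string are supplied from outside and changed at compile time.
New work for language lawyers: if an iterator is a T*, a whole pile of conflicts arises, and the article lists them at length. Not covered here.
Exceptions: my advice is not to use them.
C++ build dependencies and header file problems; a perennial discussion.
A Google Benchmark tutorial with several case studies.
Basic prevention of compiler optimization: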
static void i32_addition_semirandom(bm::State &state) {
int32_t a = std::rand(), b = std::rand(), c = 0;
for (auto _ : state)
bm::DoNotOptimize(c = (++a) + (++b));
}
A simple comparison of math implementations:
static void f64_sin(bm::State &state) {
double argument = std::rand(), result = 0;
for (auto _ : state)
bm::DoNotOptimize(result = std::sin(argument += 1.0));
}
static void f64_sin_maclaurin(bm::State &state) {
double argument = std::rand(), result = 0;
for (auto _ : state) {
argument += 1.0;
result = argument - std::pow(argument, 3) / 6 + std::pow(argument, 5) / 120;
bm::DoNotOptimize(result);
}
}
static void f64_sin_maclaurin_powless(bm::State &state) {
double argument = std::rand(), result = 0;
for (auto _ : state) {
argument += 1.0;
result = argument - (argument * argument * argument) / 6.0 +
(argument * argument * argument * argument * argument) / 120.0;
bm::DoNotOptimize(result);
}
}
[[gnu::optimize("-ffast-math")]]
static void f64_sin_maclaurin_with_fast_math(bm::State &state) {
double argument = std::rand(), result = 0;
for (auto _ : state) {
argument += 1.0;
result = argument - (argument * argument * argument) / 6.0 +
(argument * argument * argument * argument * argument) / 120.0;
bm::DoNotOptimize(result);
}
}
Note how this attribute is used. The last variant is very fast.
Integer division:
static void i64_division(bm::State &state) {
int64_t a = std::rand(), b = std::rand(), c = 0;
for (auto _ : state)
bm::DoNotOptimize(c = (++a) / (++b));
}
static void i64_division_by_const(bm::State &state) {
int64_t money = 2147483647;
int64_t a = std::rand(), c;
for (auto _ : state)
bm::DoNotOptimize(c = (++a) / *std::launder(&money));
}
static void i64_division_by_constexpr(bm::State &state) {
constexpr int64_t b = 2147483647;
int64_t a = std::rand(), c;
for (auto _ : state)
bm::DoNotOptimize(c = (++a) / b);
}
The constexpr version is very fast.
Hardware acceleration:
[[gnu::target("default")]] static void u64_population_count(bm::State &state) {
auto a = static_cast<uint64_t>(std::rand());
for (auto _ : state)
bm::DoNotOptimize(__builtin_popcountll(++a)); // ll variant: the argument is 64-bit
}
[[gnu::target("popcnt")]] static void u64_population_count_x86(bm::State &state) {
auto a = static_cast<uint64_t>(std::rand());
for (auto _ : state)
bm::DoNotOptimize(__builtin_popcountll(++a)); // ll variant: the argument is 64-bit
}
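If the hardware supports the popcnt instruction, there is a speedup.
The article includes a summary table; no screenshot here.
Videos
Again the template + fixed_string/array scenario: when the fixed_string/array values are identical, the identical template instantiations are merged, so this acts as a form of compression that reuses the data segment.
Just static_assert plus type traits; nothing much to add.
Yet another parser generator is introduced.
Open source projects looking for contributors
- asteria: an embeddable scripting language, permanently looking for contributors. Please lend a hand; you can also join group 384042845 to talk with the author directly
- pika: a NoSQL store (Redis over RocksDB) that badly needs code contributors. If you are interested, join group 294254078
If you have read this far and have suggestions, questions, or corrections, please leave a comment. Thank you! Your comments matter a lot. Likes, bookmarks, and shares are also appreciated. Thanks for your support!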
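C++ News Digest, Issue 51
A roundup of C++ news gathered from reddit/hackernews/lobsters/meetingcpp
There is a QQ channel; tap to join from mobile QQ
Submissions are welcome: to recommend or self-nominate an article, tool, or resource, please file an issue
News
Standards committee updates and IDE/compiler news go here
Recommended reading: C++ exceptions are becoming more and more problematic
Exceptions are full of pitfalls.
For the latest compiler news, follow the hellogcc WeChat account. This week's update: 2022-02-23, issue 138
Articles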
[Linux's getrandom() Sees A 8450% Improvement With Latest Code](https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/log/)
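The algorithm was replaced: BLAKE2s is now used instead of SHA-1.
[Chrome V8 source code walkthrough series](https://www.zhihu.com/people/v8blink/posts)
This author has written many articles. Worth following if you are interested in browsers or work in the field. I don't know the area well, so I won't say more.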
[My favorite C++20 feature](https://schneide.blog/2022/02/21/my-favorite-c20-feature/)
Designated initializers are indeed quite handy:
auto request = http_request{
.method = http_method::get,
.uri = "localhost:7634",
.headers = { { .name = "Authorization", .value = "Bearer TOKEN" } },
};
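[C++ reflection in depth - 1. Overview of ponder's reflection implementation](https://zhuanlan.zhihu.com/p/471396674)
[C++ reflection in depth - 2. Analysis of the property implementation](https://zhuanlan.zhihu.com/p/472265782)
A walkthrough of the ponder library. Worth reading if you want to learn about reflection.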
[A Good Way to Handle Errors Is To Prevent Them from Happening in the First Place](https://www.fluentcpp.com/2022/02/25/a-good-way-to-handle-errors-is-to-prevent-them-from-happening-in-the-first-place/)
Eliminate errors up front wherever possible, or wrap them with optional / expected / outcome.
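To make the "wrap it in optional" idea concrete, here is a minimal sketch. The function `parse_port` and its validation rules are hypothetical, chosen only for illustration; the article itself may use different examples.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical parser: returns the port number, or std::nullopt when the
// text is not a decimal number in the range 0..65535.
std::optional<int> parse_port(const std::string& text) {
    if (text.empty() || text.size() > 5) return std::nullopt;
    int port = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return std::nullopt;  // reject non-digits
        port = port * 10 + (c - '0');
    }
    if (port > 65535) return std::nullopt;  // out of range for a port
    return port;
}
```

The caller must check the optional before using the value, so the failure path is visible in the type rather than hidden in an out-parameter or a sentinel value.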
[Returning values and errors](https://rachelbythebay.com/w/2022/02/20/return/)
string* UserIP(); //1
string UserIP(string* errmsg); //2
bool GetUserIP(string* ip); //3
bool GetUserIP(string* ip, string* errmsg); //4
Result UserIP(); //5
ResultString UserIP(); //6
string UserIP(); //7
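Which interface do you think is best?
Option 1 is clearly out. Options 2, 3, and 4 all require passing in a string to be filled, which is messy. Is 5 too complex? 6 is the simpler version, but will it lead to things like ResultDouble? 7 is simple: it returns only the IP and carries no errmsg. Perhaps that is the best answer?
An open question with no single answer.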
[Implementing the FLIP algorithm](https://www.jeremyong.com/color%20theory/2022/02/19/implementing-the-flip-algorithm/)
Computer graphics material that I don't know well; marked TODO.
[Ways to Refactor Toggle/Boolean Parameters in C++](https://www.cppstories.com/2017/03/on-toggle-parameters/)
DoImportantStuff(true, false, true, false);
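We all know this kind of parameter list causes baffling problems because the meaning of each value is lost. One or two are manageable, but with more it is easy to misread them. How to refactor? Wrap them in enums: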
enum class UseCacheFlag { False, True };
enum class DeferredFlag { False, True };
enum class OptimizeFlag { False, True };
enum class FinalRenderFlag { False, True };
// and call like:
RenderGlyphs(glyphs,
UseCacheFlag::True,
DeferredFlag::False,
OptimizeFlag::True,
FinalRenderFlag::False);
Using bit flags:
#include <type_traits>
struct Glyphs { };
enum class RenderGlyphsFlags
{
useCache = 1,
deferred = 2,
optimize = 4,
finalRender = 8,
};
// simplification...
RenderGlyphsFlags operator | (RenderGlyphsFlags a, RenderGlyphsFlags b) {
using T = std::underlying_type_t<RenderGlyphsFlags>;
return static_cast<RenderGlyphsFlags>(static_cast<T>(a) | static_cast<T>(b));
// todo: missing check if the new value is in range...
}
constexpr bool IsSet(RenderGlyphsFlags val, RenderGlyphsFlags check) {
using T = std::underlying_type_t<RenderGlyphsFlags>;
return static_cast<T>(val) & static_cast<T>(check);
// todo: missing additional checks...
}
void RenderGlyphs(Glyphs &glyphs, RenderGlyphsFlags flags)
{
if (IsSet(flags, RenderGlyphsFlags::useCache)) { }
else { }
if (IsSet(flags, RenderGlyphsFlags::deferred)) { }
else { }
// ...
}
int main() {
Glyphs glyphs;
RenderGlyphs(glyphs, RenderGlyphsFlags::useCache | RenderGlyphsFlags::optimize);
}
Using a struct:
struct RenderGlyphsParam
{
bool useCache;
bool deferred;
bool optimize;
bool finalRender;
};
void RenderGlyphs(Glyphs &glyphs, const RenderGlyphsParam &renderParam);
// the call:
RenderGlyphs(glyphs,
{/*useCache*/true,
/*deferred*/false,
/*optimize*/true,
/*finalRender*/false});
With C++20 we get designated initializers, so the field names finally appear at the call site:
struct RenderGlyphsParam
{
bool useCache;
bool deferred;
bool optimize;
bool finalRender;
};
void RenderGlyphs(Glyphs &glyphs, const RenderGlyphsParam &renderParam);
// the call:
RenderGlyphs(glyphs,
{.useCache = true,
.deferred = false,
.optimize = true,
.finalRender = false});
This is the cleanest option.
[Supervising in C++: how to make your programs reliable](https://basiliscos.github.io/blog/2022/02/20/supervising-in-c-how-to-make-your-programs-reliable/)
Introduces supervision strategies in C++ and the use of actor frameworks, which are rarely used. In practice people mostly put together a taskflow-style model and skip let-it-crash; that kind of thing is left to a management system behind the scenes, not done inside the business process.
Videos
C++ Weekly - Ep 312 - Stop Using constexpr (And Use This Instead!)
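constexpr on functions: no problem.
constexpr on a value: the value is not necessarily computed at compile time (const would do as well); it depends on the compiler. A constexpr local variable also lives on the stack, so watch out for scope and lifetime issues.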
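A small sketch of the lifetime point, assuming the usual remedy of giving the constant static storage duration with `static constexpr`. The function `squares` is a hypothetical example, not code from the video.

```cpp
#include <array>
#include <cassert>

// Hypothetical lookup helper. The table is `static constexpr`, so it has
// static storage duration and the returned reference stays valid after
// the function returns. With plain `constexpr` the array would be an
// ordinary local object, and returning a reference to it would dangle.
const std::array<int, 4>& squares() {
    static constexpr std::array<int, 4> table{0, 1, 4, 9};
    return table;
}
```

Calling `squares()[3]` reads from the single static table; nothing is rebuilt on the stack per call.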
[Keynote: C++'s Superpower - Matt Godbolt - CPPP 2021](https://www.youtube.com/watch?v=0_UttFDnV3k)
An overview of the surrounding ecosystem.
[Introduction to memory exploitation - Patricia Aas - Meeting C++ 2021](https://www.youtube.com/watch?v=s18lHhN-NXc)
Explains how fuzzers work.
[Design of a C++ reflection API - Matúš Chochlík - Meeting C++ online](https://www.youtube.com/watch?v=BP0gsVy502w)
Introduces a reflection library he wrote.
[The Basics of Profiling - Mathieu Ropert - CppCon 2021](https://www.youtube.com/watch?v=dToaepIXW4s)
Not very interesting; it covers profiling on Windows.
[Design and Implementation of Highly Scalable Quantifiable Data Structures in C++ - CppCon 2021](https://www.youtube.com/watch?v=ECWsLj0pgbI&list=PLHTh1InhhwT6vjwMy3RG5Tnahw0G9qIx6&index=74)
What is this talk even about? The paper is in the book Parallel Computing Technologies; can anyone get an electronic copy? The original is too expensive. I simply could not follow it. Marked TODO; I'll revisit when I get the chance.
Open source projects looking for contributors
[asteria](https://github.com/lhmouse/asteria): an embeddable scripting language, permanently looking for contributors. Please lend a hand; you can also join group 384042845 to talk with the author directly
[pika](https://github.com/OpenAtomFoundation/pika): a NoSQL store (Redis over RocksDB) that badly needs code contributors. If you are interested, join group 294254078
New projects / release updates
[raw pdb](https://github.com/MolecularMatters/raw_pdb): a C++17 library for parsing PDB files
[ledit](https://github.com/liz3/ledit): an editor
[HFSM2 development might slow down](https://www.reddit.com/r/cpp/comments/t0od6u/hfsm2_development_might_slow_down/): with war under way in Ukraine, the author, who lives there, cannot focus on work
[thread-pool](https://github.com/DeveloperPaul123/thread-pool): yet another thread pool implementation
What do we actually need when implementing a thread pool: worker threads, or task submission and management? A bare thread pool is worth a look, but of limited use.
C++ News Digest, Issue 50
A roundup of C++ news gathered from [reddit](https://www.reddit.com/r/cpp/)/[hackernews](https://news.ycombinator.com/)/[lobsters](https://lobste.rs/)/[meetingcpp](https://www.meetingcpp.com/blog/blogroll/items/Meeting-Cpp-Blogroll-317.html)
[Project repository](https://github.com/wanghenshui/cppweeklynews) | [Online version](https://wanghenshui.github.io/cppweeklynews/) | [Zhihu column](https://www.zhihu.com/column/jieyaren) | [Tencent Cloud+ community](https://cloud.tencent.com/developer/column/92884)
There is a QQ channel; [tap to join from mobile QQ](https://qun.qq.com/qqweb/qunpro/share?_wv=3&_wwv=128&inviteCode=xzjHQ&from=246610&biz=ka)
Submissions are welcome: to recommend or self-nominate an article, tool, or resource, please [file an issue](https://github.com/wanghenshui/cppweeklynews/issues)
News
Standards committee updates and IDE/compiler news go here
The C++ Summit will be held in Shanghai in March. A two-day pass costs close to 6,000, which is really expensive; at that price it costs more than CppCon.
[Visual Studio 2022 17.1 is now available!](https://devblogs.microsoft.com/visualstudio/visual-studio-2022-17-1-is-now-available/)
[For the latest compiler news, follow the hellogcc WeChat account. This week's update: 2022-02-16, issue 137](https://github.com/hellogcc/osdt-weekly/blob/master/weekly-2022/2022-02-16.md)
Articles
- [Did you know that C++23 added Attributes on Lambda-Expressions?](https://github.com/QuantlabFinancial/cpp_tip_of_the_week/)
constexpr auto foo = [] [[deprecated]] { };
int main() {
foo(); // operator() is deprecated
}
Lambdas can now carry attributes.
This mainly relies on clang's -ftime-trace flag.
I recall gcc has something similar, but I could not find it.
- [C++ Templates: How to Iterate through std::tuple: std::apply and More](https://www.cppstories.com/2022/tuple-iteration-apply/)
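Continuing from the previous article: if we can iterate over a tuple to print it, we can surely iterate over it and call a lambda on each element. How?
The core code reuses the earlier index_sequence approach and additionally expands a parameter pack.
for_each_tuple resembles the earlier printtuple. for_each_tuple2 is easier to follow: it relies mainly on templated lambdas and also expands a parameter pack.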
template <typename TupleT, typename Fn, std::size_t... Is>
void for_each_tuple_impl(TupleT&& tp, Fn&& fn, std::index_sequence<Is...>) {
(fn(std::get<Is>(std::forward<TupleT>(tp))), ...);
}
template <typename TupleT, typename Fn, std::size_t TupSize = std::tuple_size_v<std::remove_cvref_t<TupleT>>>
void for_each_tuple(TupleT&& tp, Fn&& fn) {
for_each_tuple_impl(std::forward<TupleT>(tp), std::forward<Fn>(fn), std::make_index_sequence<TupSize>{});
}
template <typename TupleT, typename Fn>
void for_each_tuple2(TupleT&& tp, Fn&& fn) {
std::apply
(
[&fn]<typename ...T>(T&& ...args)
{
(fn(std::forward<T>(args)), ...);
}, std::forward<TupleT>(tp)
);
}
- [Constant references are not always your friends](https://belaycpp.com/2022/02/15/constant-references-are-not-always-your-friends/)
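Passing parameters that need no copy as const T& usually works, but not in every scenario, for example when T is poorly designed.
We need to consider how T is designed. Also, for small objects such as string_view, span, or int, do not use const T&; pass them by value.
- [c++ execution and coroutine (5): async - Zhihu (zhihu.com)](https://zhuanlan.zhihu.com/p/441741987)
[c++ execution and coroutine (6): coroutine overview - Zhihu (zhihu.com)](https://zhuanlan.zhihu.com/p/443847625)
[c++ execution and coroutine (7): an awaiter is also a sender](https://zhuanlan.zhihu.com/p/445943412)
I just learned that executors will not make it into C++23, which is a pity. The concepts are still worth learning; the level of abstraction is very high.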
- [C++ Trailing Return Types](https://www.danielsieger.com/blog/2022/01/28/cpp-trailing-return-types.html)
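Discusses the case for placing the return type after the parameter list, mainly because the author's library often hits cases where the return type is not known up front, for example: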
template<typename A, typename B>
decltype(std::declval<A>() * std::declval<B>()) multiply(A a, B b) { return a*b; }
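For comparison, the same function can be sketched with a trailing return type. This rewrite is an illustration of the general technique and is not copied from the article.

```cpp
#include <cassert>
#include <type_traits>

// Trailing return type: the parameters a and b are already in scope when
// the return type is written, so decltype can name them directly and
// std::declval is no longer needed.
template <typename A, typename B>
auto multiply(A a, B b) -> decltype(a * b) {
    return a * b;
}

// int * double yields double, and the deduced return type follows suit.
static_assert(std::is_same_v<decltype(multiply(2, 3.5)), double>);
```

The declaration reads left to right: name, parameters, then a result type expressed in terms of those parameters.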
- [The 114 standard C++ algorithms. Introduction](https://itnext.io/the-114-standard-c-algorithms-introduction-2a75a2df4300)
An overview of the standard library algorithms; worth a look.
Consider member functions with ref-qualifiers:
void Foo::bar() & { /* ... */ }
void Foo::bar() && { /* ... */ }
void Foo::bar() const & { /* ... */ }
void Foo::bar() const && { /* ... */ }
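These qualifiers restrict each overload to Foo objects of a particular value category and constness.
With deducing this, the overloads can be collapsed. For example: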
template <typename T>
class OptionalNotDeducingThis {
// ...
constexpr T* operator->() {
return addressof(this->m_value);
}
constexpr T const*
operator->() const {
return addressof(this->m_value);
}
// ...
};
template <typename T>
class OptionalDeducingThis {
// ...
template <typename Self>
constexpr auto operator->(this Self&& self) {
return addressof(self.m_value);
}
// ...
};
Self determines what auto deduces: on a const object the result is T const*, and on a non-const object it is T*.
- [Faster integer formatting - James Anhalt (jeaiii)’s algorithm](https://jk-jeon.github.io/posts/2022/02/jeaiii-algorithm/)
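An algorithm for serializing integers to strings (itoa) that is faster than the one inside the fmt library, although the fmt author has not considered adopting it.
A brief introduction follows.
The simplest version: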
char* itoa_naive(std::uint32_t n, char* buffer) {
char temp[10];
char* ptr = temp + sizeof(temp) - 1;
while (n >= 10) {
*ptr = char('0' + (n % 10));
n /= 10;
--ptr;
}
*ptr = char('0' + n);
auto length = temp + sizeof(temp) - ptr;
std::memcpy(buffer, ptr, length);
return buffer + length;
}
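The integer is serialized into the temp array and then copied out. The buffer has size 10 because a 32-bit unsigned integer has at most 10 decimal digits.
Dividing by 10 in a loop is obviously slow. To cut the number of iterations, divide by 100 instead
and produce both digits of the remainder in one step.
The first optimization that comes to mind is to look the digits up in a table rather than compute them: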
static constexpr char radix_100_table[] = {
'0', '0', '0', '1', '0', '2', '0', '3', '0', '4',
'0', '5', '0', '6', '0', '7', '0', '8', '0', '9',
'1', '0', '1', '1', '1', '2', '1', '3', '1', '4',
'1', '5', '1', '6', '1', '7', '1', '8', '1', '9',
'2', '0', '2', '1', '2', '2', '2', '3', '2', '4',
'2', '5', '2', '6', '2', '7', '2', '8', '2', '9',
'3', '0', '3', '1', '3', '2', '3', '3', '3', '4',
'3', '5', '3', '6', '3', '7', '3', '8', '3', '9',
'4', '0', '4', '1', '4', '2', '4', '3', '4', '4',
'4', '5', '4', '6', '4', '7', '4', '8', '4', '9',
'5', '0', '5', '1', '5', '2', '5', '3', '5', '4',
'5', '5', '5', '6', '5', '7', '5', '8', '5', '9',
'6', '0', '6', '1', '6', '2', '6', '3', '6', '4',
'6', '5', '6', '6', '6', '7', '6', '8', '6', '9',
'7', '0', '7', '1', '7', '2', '7', '3', '7', '4',
'7', '5', '7', '6', '7', '7', '7', '8', '7', '9',
'8', '0', '8', '1', '8', '2', '8', '3', '8', '4',
'8', '5', '8', '6', '8', '7', '8', '8', '8', '9',
'9', '0', '9', '1', '9', '2', '9', '3', '9', '4',
'9', '5', '9', '6', '9', '7', '9', '8', '9', '9'
};
char* itoa_two_digits_per_div(std::uint32_t n, char* buffer) {
char temp[8];
char* ptr = temp + sizeof(temp);
while (n >= 100) {
ptr -= 2;
std::memcpy(ptr, radix_100_table + (n % 100) * 2, 2);
n /= 100;
}
if (n >= 10) {
std::memcpy(buffer, radix_100_table + n * 2, 2);
buffer += 2;
}
else {
buffer[0] = char('0' + n);
buffer += 1;
}
auto remaining_length = temp + sizeof(temp) - ptr;
std::memcpy(buffer, ptr, remaining_length);
return buffer + remaining_length;
}
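Honestly, this is where I stopped following, and it only gets harder from here. Marked TODO; I'll study it when I have time.
Some of the author's optimization experience; I see Lemire follows it too. The write-up is quite attention-grabbing: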
- 2x faster GCD (compared to std::gcd)
- 8-15x faster binary search (compared to std::lower_bound)
- 7x faster segment trees
- 5x faster hash tables (compared to std::unordered_map)
- ?x faster popcount
- 2x faster parsing series of integers (compared to scanf)
- ?x faster sorting (compared to std::sort)
- 2x faster sum (compared to std::accumulate)
- 10x faster array searching (compared to std::find)
- 100x faster matrix multiplication (compared to “for-for-for”)
- optimal word-size integer factorization (~0.4ms per 60-bit integer)
- optimal Karatsuba Algorithm
- optimal FFT
- argmin at the speed of memory
The article is long and will take a while to get through; marked TODO for now.
- [Projections are Function Adaptors](https://brevzin.github.io/c++/2022/02/13/projections-function-adaptors/)
struct Person {
std::string first;
std::string last;
};
std::vector<Person> people = { /* ... */ };
std::vector<std::string> r_names;
std::ranges::copy_if(
people,
std::back_inserter(r_names),
[](std::string const& s) { return s[0] == 'R'; },
&Person::last);
std::ranges::copy_if(
people | std::views::transform(&Person::last),
std::back_inserter(r_names),
[](std::string const& s) { return s[0] == 'R'; });
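Do you see the difference between the two snippets? The first one does not work, because the projection is applied only to the predicate: copy_if still copies Person objects, which cannot be written into a vector of std::string.
Videos
Nothing much to add.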
- [[SIMD algorithms]...
Issue 49
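C++ News Digest, Issue 49
A roundup of C++ news gathered from reddit/hackernews/lobsters/meetingcpp1 meetingcpp2
There is a QQ channel; tap to join from mobile QQ
Submissions are welcome: to recommend or self-nominate an article, tool, or resource, please file an issue
News
Standards committee updates and IDE/compiler news go here
For the latest compiler news, follow the hellogcc WeChat account. This week's updates: 2022-02-02, issue 135; 2022-02-09, issue 136
Articles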
#include <bit>
#include <cstdint>
#include <iostream>
int main() {
constexpr auto value = std::uint16_t(0xCAFE);
std::cout << std::hex << value; // prints cafe
std::cout << std::hex << std::byteswap(value); // prints feca
}
Nothing much to add.
#define VARIADIC(...) __VA_OPT__(__LINE__)
VARIADIC() // `empty`
VARIADIC(a) // `line` 4
VARIADIC(a, b) // `line` 5
Result: https://godbolt.org/z/rsj9ax7xY
#define FOO(...) printf(__VA_ARGS__)
#define BAR(fmt, ...) printf(fmt, __VA_ARGS__)
FOO("this works fine");
BAR("this breaks!");
The last line expands with a dangling comma, so the call fails to compile. How do we swallow that comma?
A GCC extension:
#define BAR(fmt, ...) printf(fmt "\n", ##__VA_ARGS__)
BAR("here is a log message");
BAR("here is a log message with a param: %d", 42);
Or use __VA_OPT__.
I believe Boost.PP has something like this.
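A minimal sketch of the __VA_OPT__ variant, assuming a C++20 compiler. The macro name `FORMAT` and the two wrapper functions are hypothetical, introduced only so the result can be inspected as a string.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// FORMAT is a hypothetical macro for illustration. __VA_OPT__(,) expands
// to a comma only when variadic arguments are present, so the call stays
// well-formed when nothing but the format string is passed.
#define FORMAT(buf, fmt, ...) \
    std::snprintf(buf, sizeof(buf), fmt __VA_OPT__(,) __VA_ARGS__)

// Format string only: no trailing comma is generated.
inline std::string format_plain() {
    char buf[32];
    FORMAT(buf, "no params");
    return buf;
}

// Format string plus one argument: the comma is emitted.
inline std::string format_with_arg(int value) {
    char buf[32];
    FORMAT(buf, "param: %d", value);
    return buf;
}
```

Unlike the `##__VA_ARGS__` form, this spelling is standard C++20 rather than a compiler extension.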
Also, how do you check whether the compiler supports this macro? Like this:
#define PP_THIRD_ARG(a,b,c,...) c
#define VA_OPT_SUPPORTED_I(...) PP_THIRD_ARG(__VA_OPT__(,),true,false,)
#define VA_OPT_SUPPORTED VA_OPT_SUPPORTED_I(?)
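PP_THIRD_ARG yields only its third argument. If __VA_OPT__ is supported, PP_THIRD_ARG(__VA_OPT__(,),true,false,) expands to PP_THIRD_ARG(,,true,false,), so the third argument is true. If __VA_OPT__ is not expanded, the third argument is false. Rather neat.
Just for fun; this is using a sledgehammer to crack a nut.
This was covered before: how to parse numeric strings faster.
SWAR is one of those techniques, and here the veteran PhD author explains the principle once more.
The simplest possible version: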
uint32_t parse_eight_digits(const unsigned char *chars) {
uint32_t x = chars[0] - '0';
for (size_t j = 1; j < 8; j++)
x = x * 10 + (chars[j] - '0');
return x;
}
Validity is not considered here; the input is not checked.
Typically, the compiler unrolls the loop like this:
movzx eax, byte ptr [rdi]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 1]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 2]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 3]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 4]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 5]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 6]
lea eax, [rcx + 2*rax]
lea eax, [rax + 4*rax]
movzx ecx, byte ptr [rdi + 7]
lea eax, [rcx + 2*rax]
add eax, -533333328
Many of the instructions are identical. Condensed, it looks like this:
imul rax, qword ptr [rdi], 2561
movabs rcx, -1302123111085379632
add rcx, rax
shr rcx, 8
movabs rax, 71777214294589695
and rax, rcx
imul rax, rax, 6553601
shr rax, 16
movabs rcx, 281470681808895
and rcx, rax
movabs rax, 42949672960001
imul rax, rcx
shr rax, 32
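How can we write code that directly produces the second form of assembly?
SWAR stands for SIMD within a register. The idea is to use the whole register and do more computation per instruction, as the assembly above shows.
To save computation, arithmetic has to be done on all 8 bytes at once.
What remains is constructing the operations.
Instead of subtracting '0' from each byte one at a time, subtract 0x30 from all eight bytes at once: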
val = val - 0x3030303030303030;
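If the digit string is 12345678, the value after the subtraction is 0x0807060504030201 in hexadecimal (little-endian byte order).
Then multiply and shift: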
val = (val * 10) + (val >> 8);
The final result looks like this:
uint32_t parse_eight_digits_unrolled(uint64_t val) {
const uint64_t mask = 0x000000FF000000FF;
const uint64_t mul1 = 0x000F424000000064; // 100 + (1000000ULL << 32)
const uint64_t mul2 = 0x0000271000000001; // 1 + (10000ULL << 32)
val -= 0x3030303030303030;
val = (val * 10) + (val >> 8); // val = (val * 2561) >> 8;
val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
return val;
}
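To try the routine on an actual string, the 8 bytes have to be loaded into a uint64_t first. The sketch below adds that load step with memcpy. It assumes a little-endian machine and exactly eight ASCII digits, performs no validation, and the name `parse_eight_digits_swar` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same arithmetic as parse_eight_digits_unrolled, preceded by the load.
// Assumptions: little-endian host, input is exactly eight ASCII digits.
inline std::uint32_t parse_eight_digits_swar(const char* chars) {
    std::uint64_t val;
    std::memcpy(&val, chars, sizeof(val));  // unaligned-safe 8-byte load
    const std::uint64_t mask = 0x000000FF000000FF;
    const std::uint64_t mul1 = 0x000F424000000064;  // 100 + (1000000ULL << 32)
    const std::uint64_t mul2 = 0x0000271000000001;  // 1 + (10000ULL << 32)
    val -= 0x3030303030303030;       // ASCII digit -> numeric value, per byte
    val = (val * 10) + (val >> 8);   // even bytes now hold two-digit pairs
    // Combine the four pairs into one 8-digit number in the upper 32 bits.
    val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
    return static_cast<std::uint32_t>(val);
}
```

Each step handles several digits in a single 64-bit operation, which is where the saving over the byte-by-byte loop comes from.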
The same trick shows up elsewhere, for example in Java's Integer.bitCount:
public static int bitCount(int i) {
// HD, Figure 5-2
i = i - ((i >>> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
i = (i + (i >>> 4)) & 0x0f0f0f0f;
i = i + (i >>> 8);
i = i + (i >>> 16);
return i & 0x3f;
}
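As for the principle: HD refers to the book Hacker's Delight.
The hard part is really the construction: you have to work out how to piece the result together with arithmetic.
Computing an average without overflow: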
unsigned average(unsigned a, unsigned b)
{
return (a + b) / 2;
}
This obviously can overflow.
If you know which of the two numbers is larger:
unsigned average(unsigned low, unsigned high)
{
return low + (high - low) / 2;
}
There is also a way that does not need to know the ordering:
unsigned average(unsigned a, unsigned b)
{
return (a / 2) + (b / 2) + (a & b & 1);
}
Of course, the SWAR approach is faster:
unsigned average(unsigned a, unsigned b)
{
return (a & b) + (a ^ b) / 2;
}
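Principle: a + b = ((a & b) << 1) + (a ^ b). The first term holds the carries and the second the carry-less sum, so halving gives (a & b) + (a ^ b) / 2 without ever forming a + b.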
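A quick way to see the difference is to feed both formulas values near the top of the range. The function names below are chosen for this sketch only.

```cpp
#include <cassert>
#include <climits>

// Naive form: a + b wraps around for large inputs (unsigned overflow is
// well defined, but the result is wrong).
inline unsigned average_naive(unsigned a, unsigned b) {
    return (a + b) / 2;
}

// Bitwise form: shared bits (a & b) count once in full, differing bits
// (a ^ b) count half; the sum a + b is never formed.
inline unsigned average_swar(unsigned a, unsigned b) {
    return (a & b) + (a ^ b) / 2;
}
```

For small inputs the two agree; for inputs whose sum exceeds UINT_MAX only the bitwise form returns the true (rounded-down) average.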
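The author also discusses implementations on different platforms. Worth a read if you enjoy the fine details.
Also recommended: CppCon 2019: Marshall Clow “std::midpoint? How Hard Could it Be?”
Someone wrote a brainfuck compiler using constexpr and compiler optimizations. If you don't know brainfuck, look it up first; the code is pasted directly here.
An implementation: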
enum class op
{
ptr_inc, // >
ptr_dec, // <
data_inc, // +
data_dec, // -
write, // .
read, // ,
jmp_ifz, // [, jump if zero
jmp, // ], unconditional jump
};
template <std::size_t InstructionCapacity>
struct program
{
std::size_t inst_count;
op inst[InstructionCapacity];
std::size_t inst_jmp[InstructionCapacity];
};
template <std::size_t InstructionCapacity>
void execute(const program<InstructionCapacity>& program,
unsigned char* data_ptr)
{
auto inst_ptr = std::size_t(0);
while (inst_ptr < program.inst_count)
{
switch (program.inst[inst_ptr])
{
case op::ptr_inc:
++data_ptr;
++inst_ptr;
break;
case op::ptr_dec:
--data_ptr;
++inst_ptr;
break;
case op::data_inc:
++*data_ptr;
++inst_ptr;
break;
case op::data_dec:
--*data_ptr;
++inst_ptr;
break;
case op::write:
std::putchar(*data_ptr);
++inst_ptr;
break;
case op::read:
*data_ptr = static_cast<unsigned char>(std::getchar());
++inst_ptr;
break;
case op::jmp_ifz:
if (*data_ptr == 0)
inst_ptr = program.inst_jmp[inst_ptr];
else
++inst_ptr;
break;
case op::jmp:
inst_ptr = program.inst_jmp[inst_ptr];
break;
}
}
}
template <std::size_t N>
constexpr auto parse(const char (&str)[N])
{
program<N> result{};
std::size_t jump_stack[N] = {};
std::size_t jump_stack_top = 0;
for (auto ptr = str; *ptr; ++ptr)
{
if (*ptr == '>')
result.inst[result.inst_count++] = op::ptr_inc;
else if (*ptr == '<')
result.inst[result.inst_count++] = op::ptr_dec;
else if (*ptr == '+')
result.inst[result.inst_count++] = op::data_inc;
else if (*ptr == '-')
result.inst[result.inst_count++] = op::data_dec;
else if (*ptr == '.')
result.inst[result.inst_count++] = op::write;
else if (*ptr == ',')
result.inst[result.inst_count++] = op::read;
else if (*ptr == '[')
{
jump_stack[jump_stack_top++] = result.inst_count;
result.inst[result.inst_count++] = op::jmp_ifz;
}
else if (*ptr == ']')
{
auto open = jump_stack[--jump_stack_top];
auto close = result.inst_count++;
result.inst[close] = op::jmp;
result.inst_jmp[close] = open;
result.inst_jmp[open] = close + 1;
}
}
return result;
}
How to use it?
// `x = std::getchar(); y = x + 3; std::putchar(y);`
static constexpr auto add3 = parse(",>+++<[->+<]>.");
// Use this array for our data_ptr.
unsigned char memory[1024] = {};
execute(add3, memory);
Not very hard.
If you want to play with ji...