Computer Science Academic Digest [1.10]
cs (Computer Science) digest, 146 papers in total
【1】 Embodied Hands: Modeling and Capturing Hands and Bodies Together
Authors: Javier Romero, Dimitrios Tzionas, Michael J. Black
Abstract: Humans move their hands and bodies together to communicate and solve tasks.
Capturing and replicating such coordinated activity is critical for virtual
characters that behave realistically. Surprisingly, most methods treat the 3D
modeling and tracking of bodies and hands separately. Here we formulate a model
of hands and bodies interacting together and fit it to full-body 4D sequences.
When scanning or capturing the full body in 3D, hands are small and often
partially occluded, making their shape and pose hard to recover. To cope with
low-resolution, occlusion, and noise, we develop a new model called MANO (hand
Model with Articulated and Non-rigid defOrmations). MANO is learned from around
1000 high-resolution 3D scans of hands of 31 subjects in a wide variety of hand
poses. The model is realistic, low-dimensional, captures non-rigid shape
changes with pose, is compatible with standard graphics packages, and can fit
any human hand. MANO provides a compact mapping from hand poses to pose blend
shape corrections and a linear manifold of pose synergies. We attach MANO to a
standard parameterized 3D body shape model (SMPL), resulting in a fully
articulated body and hand model (SMPL+H). We illustrate SMPL+H by fitting
complex, natural, activities of subjects captured with a 4D scanner. The
fitting is fully automatic and results in full body models that move naturally
with detailed hand motions and a realism not seen before in full body
performance capture. The models and data are freely available for research
purposes on our website (http://mano.is.tue.mpg.de).
【2】 Generalized Category Discovery
Authors: Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
Comments: 13 pages, 6 figures
Abstract: In this paper, we consider a highly general image recognition setting
wherein, given a labelled and unlabelled set of images, the task is to
categorize all images in the unlabelled set. Here, the unlabelled images may
come from labelled classes or from novel ones. Existing recognition methods are
not able to deal with this setting, because they make several restrictive
assumptions, such as the unlabelled instances only coming from known - or
unknown - classes and the number of unknown classes being known a priori. We
address the more unconstrained setting, naming it 'Generalized Category
Discovery', and challenge all these assumptions. We first establish strong
baselines by taking state-of-the-art algorithms from novel category discovery
and adapting them for this task. Next, we propose the use of vision
transformers with contrastive representation learning for this open world
setting. We then introduce a simple yet effective semi-supervised $k$-means
method to cluster the unlabelled data into seen and unseen classes
automatically, substantially outperforming the baselines. Finally, we also
propose a new approach to estimate the number of classes in the unlabelled
data. We thoroughly evaluate our approach on public datasets for generic object
classification including CIFAR10, CIFAR100 and ImageNet-100, and for
fine-grained visual recognition including CUB, Stanford Cars and Herbarium19,
benchmarking on this new setting to foster future research.
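The semi-supervised $k$-means idea above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that labelled points stay pinned to the cluster of their class while unlabelled points are reassigned freely, and it uses a simple farthest-point initialisation for the novel-class centroids. The function names and that initialisation are assumptions made for this sketch.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def semi_supervised_kmeans(labelled, labels, unlabelled, k, iters=20):
    """Cluster into k groups; labelled points always stay in their class cluster.

    Returns (centroids, cluster index for every unlabelled point).
    Assumes at least one labelled class and k >= number of labelled classes.
    """
    known = sorted(set(labels))
    # Clusters 0..len(known)-1 are the labelled classes, started at class means.
    centroids = [mean([p for p, y in zip(labelled, labels) if y == c]) for c in known]
    # Remaining (novel) clusters start at the unlabelled point farthest from
    # all current centroids (an assumption of this sketch).
    for _ in range(k - len(known)):
        far = max(unlabelled,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(far))
    assign = [0] * len(unlabelled)
    for _ in range(iters):
        # Assignment step: only unlabelled points may change cluster.
        assign = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in unlabelled]
        # Update step: labelled points always contribute to their own class.
        for j in range(k):
            members = [p for p, a in zip(unlabelled, assign) if a == j]
            if j < len(known):
                members += [p for p, y in zip(labelled, labels) if y == known[j]]
            if members:
                centroids[j] = mean(members)
    return centroids, assign
```

Unlabelled points can therefore land either in a known-class cluster or in one of the extra clusters, which is the behaviour the abstract describes for seen and unseen classes.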
【3】 Detecting Twenty-thousand Classes using Image-level Supervision
Authors: Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra
Comments: Code is available at https://github.com/facebookresearch/Detic
Abstract: Current object detectors are limited in vocabulary size due to the small
scale of detection datasets. Image classifiers, on the other hand, reason about
much larger vocabularies, as their datasets are larger and easier to collect.
We propose Detic, which simply trains the classifiers of a detector on image
classification data and thus expands the vocabulary of detectors to tens of
thousands of concepts. Unlike prior work, Detic does not assign image labels to
boxes based on model predictions, making it much easier to implement and
compatible with a range of detection architectures and backbones. Our results
show that Detic yields excellent detectors even for classes without box
annotations. It outperforms prior work on both open-vocabulary and long-tail
detection benchmarks. Detic provides a gain of 2.4 mAP for all classes and 8.3
mAP for novel classes on the open-vocabulary LVIS benchmark. On the standard
LVIS benchmark, Detic reaches 41.7 mAP for all classes and 41.7 mAP for rare
classes. For the first time, we train a detector with all the
twenty-one-thousand classes of the ImageNet dataset and show that it
generalizes to new datasets without fine-tuning. Code is available at
https://github.com/facebookresearch/Detic.
【4】 Wavenumber-explicit hp-FEM analysis for Maxwell's equations with impedance boundary conditions
Authors: Jens M. Melenk, Stefan A. Sauter
Comments: 80 pages, 6 figures
Abstract: The time-harmonic Maxwell equations at high wavenumber $k$ in domains with an
analytic boundary and impedance boundary conditions are considered. A
wavenumber-explicit stability and regularity theory is developed that
decomposes the solution into a part with finite Sobolev regularity that is
controlled uniformly in $k$ and an analytic part. Using this regularity,
quasi-optimality of the Galerkin discretization based on Nédélec elements of
order $p$ on a mesh with mesh size $h$ is shown under the $k$-explicit scale
resolution condition that a) $kh/p$ is sufficiently small and b) $p/\ln k$ is bounded
from below.
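Written out, the two-part scale resolution condition reads as below. The constants $c_1$ and $c_2$ are placeholder names introduced here for illustration; the abstract does not name or quantify them.

```latex
\text{(a)}\quad \frac{k h}{p} \le c_1 \ \text{ for some sufficiently small } c_1 > 0,
\qquad
\text{(b)}\quad p \ge c_2 \ln k \ \text{ for some fixed } c_2 > 0 .
```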
【5】 Apples and Cars: a Comparison of Security
Authors: Zhendong Ma
Comments: Extended Abstract, 5th ACM COMPUTER SCIENCE IN CARS SYMPOSIUM (CSCS 2021)
Abstract: Cybersecurity has gained importance for cars that increasingly rely on
software and networks. "Smartphone on wheels" is often used as an analogy to
highlight the need for security. As a high-value target of cyberattacks, modern
smartphones implement layers of protection. Automotive embedded systems share
many similarities with smartphones. We compare the security architecture of an
iPhone and a car to identify gaps and discuss the potentials for the cars of
the future.
【6】 Equalized Focal Loss for Dense Long-Tailed Object Detection
Authors: Bo Li, Yongqiang Yao, Jingru Tan, Gang Zhang, Fengwei Yu, Jianwei Lu, Ye Luo
Abstract: Despite the recent success of long-tailed object detection, almost all
long-tailed object detectors are developed based on the two-stage paradigm. In
practice, one-stage detectors are more prevalent in the industry because they
have a simple and fast pipeline that is easy to deploy. However, in the
long-tailed scenario, this line of work has not been explored so far. In this
paper, we investigate whether one-stage detectors can perform well in this
case. We find that the primary obstacle preventing one-stage detectors from
achieving excellent performance is that categories suffer from different degrees
of positive-negative imbalance under the long-tailed data distribution.
The conventional focal loss balances the training process with the same
modulating factor for all categories, thus failing to handle the long-tailed
problem. To address this issue, we propose the Equalized Focal Loss (EFL) that
rebalances the loss contribution of positive and negative samples of different
categories independently according to their imbalance degrees. Specifically,
EFL adopts a category-relevant modulating factor which can be adjusted
dynamically by the training status of different categories. Extensive
experiments conducted on the challenging LVIS v1 benchmark demonstrate the
effectiveness of our proposed method. With an end-to-end training pipeline, EFL
achieves 29.2% in terms of overall AP and obtains significant performance
improvements on rare categories, surpassing all existing state-of-the-art
methods. The code is available at https://github.com/ModelTC/EOD.
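To make the contrast with the conventional focal loss concrete, here is a minimal sketch. The standard focal loss is the well-known form with a single focusing parameter; the category-dependent variant below is only an illustration of the idea in the abstract. The mapping from a per-category balance score to a focusing parameter (`category_gamma`, `gamma_base`, `scale`) is a hypothetical rule chosen for this sketch and is not taken from the paper.

```python
import math


def focal_loss(p, is_positive, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one prediction p = P(category present)."""
    p_t = p if is_positive else 1.0 - p
    alpha_t = alpha if is_positive else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def category_gamma(balance, gamma_base=2.0, scale=4.0):
    """Hypothetical rule: the more imbalanced a category (balance near 0),
    the larger its focusing parameter; balance = 1 recovers gamma_base."""
    return gamma_base + scale * (1.0 - balance)


def category_focal_loss(p, is_positive, balance, alpha=0.25):
    """Focal loss with a category-dependent focusing parameter (illustrative)."""
    return focal_loss(p, is_positive, gamma=category_gamma(balance), alpha=alpha)
```

In a training loop, `balance` would be re-estimated per category as training progresses, so each category gets its own modulating factor instead of one shared value.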
【7】 Leveraging Scale-Invariance and Uncertainity with Self-Supervised Domain Adaptation for Semantic Segmentation of Foggy Scenes
Authors: Javed Iqbal, Rehan Hafiz, Mohsen Ali
Comments: Under Review
Abstract: This paper presents FogAdapt, a novel approach for domain adaptation of
semantic segmentation for dense foggy scenes. Although significant research has
been directed to reduce the domain shift in semantic segmentation, adaptation
to scenes with adverse weather conditions remains an open question. Large
variations in the visibility of the scene due to weather conditions, such as
fog, smog, and haze, exacerbate the domain shift, thus making unsupervised
adaptation in such scenarios challenging. We propose a self-entropy and
multi-scale information augmented self-supervised domain adaptation method
(FogAdapt) to minimize the domain shift in foggy scenes segmentation. Supported
by the empirical evidence that an increase in fog density results in high
self-entropy for segmentation probabilities, we introduce a self-entropy based
loss function to guide the adaptation method. Furthermore, inferences obtained
at different image scales are combined and weighted by the uncertainty to
generate scale-invariant pseudo-labels for the target domain. These
scale-invariant pseudo-labels are robust to visibility and scale variations. We
evaluate the proposed model on real clear-weather scenes to real foggy scenes
adaptation and synthetic non-foggy images to real foggy scenes adaptation
scenarios. Our experiments demonstrate that FogAdapt significantly outperforms
the current state-of-the-art in semantic segmentation of foggy images.
Specifically, by considering the standard settings compared to state-of-the-art
(SOTA) methods, FogAdapt gains 3.8% on Foggy Zurich, 6.0% on Foggy
Driving-dense, and 3.6% on Foggy Driving in mIoU when adapted from Cityscapes
to Foggy Zurich.
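The two ingredients named above, self-entropy and uncertainty-weighted fusion across scales, can be sketched for a single pixel as follows. This is a rough illustration under stated assumptions: the weighting rule (one minus normalised entropy), the fallback to a plain average, and the `max_entropy` threshold are choices made for this sketch, not details given in the abstract.

```python
import math


def self_entropy(probs):
    """Shannon entropy of one pixel's class distribution, normalised to [0, 1].

    Assumes at least two classes.
    """
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def fuse_scales(per_scale_probs):
    """Fuse one pixel's predictions from several image scales.

    Scales with lower self-entropy (more confident) receive larger weights.
    """
    weights = [max(0.0, 1.0 - self_entropy(p)) for p in per_scale_probs]
    total = sum(weights)
    if total == 0.0:
        # Every scale is maximally uncertain: fall back to a plain average.
        weights = [1.0] * len(per_scale_probs)
        total = float(len(per_scale_probs))
    n_classes = len(per_scale_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, per_scale_probs)) / total
            for c in range(n_classes)]


def pseudo_label(per_scale_probs, max_entropy=0.5):
    """Return the fused class index, or None if the fused prediction is too uncertain."""
    fused = fuse_scales(per_scale_probs)
    if self_entropy(fused) > max_entropy:
        return None
    return max(range(len(fused)), key=fused.__getitem__)
```

Pixels that return `None` would simply be left out of the pseudo-label set for the target domain.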
【8】 Elephant-Human Conflict Mitigation: An Autonomous UAV Approach
Authors: Weiyun Jiang, Yukai Yang, Yogananda Isukapalli
Abstract: Elephant-human conflict (EHC) is one of the major problems in most African
and Asian countries. As humans overutilize natural resources for their
development, elephants' living area continues to decrease; this leads elephants
to invade the human living area and raid crops more frequently, costing
millions of dollars annually. To mitigate EHC, in this paper, we propose an
original solution that comprises three parts: a compact custom low-power GPS
tag that is installed on the elephants, a receiver stationed in the human
living area that detects the elephants' presence near a farm, and an autonomous
unmanned aerial vehicle (UAV) system that tracks and herds the elephants away
from the farms. By utilizing a proportional-integral-derivative (PID) controller and
machine learning algorithms, we obtain accurate tracking trajectories at a
real-time processing speed of 32 FPS. Our proposed autonomous system can save
over 68% of the cost compared with human-controlled UAVs in mitigating EHC.
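For readers unfamiliar with the controller mentioned above, a textbook discrete PID loop looks roughly like this. The gains, the one-dimensional point-mass setup, and the `track` helper are assumptions made for illustration; only the 32 FPS update rate comes from the abstract, and nothing here reflects the authors' actual UAV control code.

```python
class PID:
    """Textbook proportional-integral-derivative controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        # Accumulate the integral term and estimate the error derivative.
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track(target, position=0.0, steps=640, dt=1.0 / 32.0):
    """Steer a 1-D point toward `target` with velocity commands; return the final position.

    dt = 1/32 s mirrors a 32 FPS update rate; the gains are illustrative only.
    """
    pid = PID(kp=2.0, ki=0.5, kd=0.05)
    for _ in range(steps):
        velocity = pid.step(target - position, dt)
        position += velocity * dt
    return position
```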
【9】 Multi-Model Federated Learning
Authors: Neelkamal Bhuyan, Sharayu Moharir
Abstract: Federated learning is a form of distributed learning with the key challenge
being the non-identically distributed nature of the data in the participating
clients. In this paper, we extend federated learning to the setting where
multiple unrelated models are trained simultaneously. Specifically, every
client is able to train any one of M models at a time and the server maintains
a model for each of the M models which is typically a suitably averaged version
of the model computed by the clients. We propose multiple policies for
assigning learning tasks to clients over time. In the first policy, we extend
the widely studied FedAvg to multi-model learning by allotting models to
clients in an i.i.d. stochastic manner. In addition, we propose two new
policies for client selection in a multi-model federated setting which make
decisions based on current local losses for each client-model pair. We compare
the performance of the policies on tasks involving synthetic and real-world
data and characterize the performance of the proposed policies. The key
take-away from our work is that the proposed multi-model policies perform
at least as well as, and sometimes better than, single-model training using FedAvg.
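The assignment policies described above can be sketched as follows. This is a loose illustration: the uniform draw in `assign_random` and the greedy "train the model with the highest current loss" rule in `assign_by_loss` are assumptions made here, since the abstract only says that the first policy is i.i.d. stochastic and that the new policies use current local losses. `federated_average` is the usual data-size-weighted FedAvg aggregation.

```python
import random


def assign_random(num_clients, num_models, seed=0):
    """Random policy (sketch): each client independently draws one model uniformly."""
    rng = random.Random(seed)
    return [rng.randrange(num_models) for _ in range(num_clients)]


def assign_by_loss(local_losses):
    """Loss-based policy (hypothetical rule): local_losses[c][m] is client c's
    current loss on model m; each client trains the model it is worst at."""
    return [max(range(len(row)), key=row.__getitem__) for row in local_losses]


def federated_average(client_weights, client_sizes):
    """FedAvg-style aggregation: data-size-weighted mean of client weight vectors."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
            for d in range(dim)]
```

In each round the server would call one assignment policy, let every client train its assigned model locally, and then run `federated_average` separately for each of the M models over the clients that trained it.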
【10】 Prognosis: Closed-Box Analysis of Network Protocol Implementations
Authors: Tiago Ferreira, Harrison Brewton, Loris D'Antoni, Alexandra Silva
Abstract: We present Prognosis, a framework offering automated closed-box learning and
analysis of models of network protocol implementations. Prognosis can learn
models that vary in abstraction level from simple deterministic automata to
models containing data operations, such as register updates, and can be used to
unlock a variety of analysis techniques -- model checking temporal properties,
computing differences between models of two implementations of the same
protocol, or improving testing via model-based test generation. Prognosis is
modular and easily adaptable to different protocols (e.g., TCP and QUIC) and
their implementations. We use Prognosis to learn models of (parts of) three
QUIC implementations -- Quiche (Cloudflare), Google QUIC, and Facebook mvfst --
and use these models to analyze the differences between the various
implementations. Our analysis provides insights into different design choices
and uncovers potential bugs. Concretely, we have found critical bugs in
multiple QUIC implementations, which have been acknowledged by the developers.
【11】 Charging Techniques for UAV-assisted Data Collection: Is Laser Power Beaming the Answer?
Authors: Mohamed-Amine Lahmeri, Mustafa A. Kishk, Mohamed-Slim Alouini
Comments: 6 pages, 5 figures
Abstract: As Covid-19 has increased the need for connectivity around the world,
researchers are targeting new technologies that could improve coverage and
connect the unconnected in order to make progress toward the United Nations
Sustainable Development Goals. In this context, drones are seen as one of the
key features of 6G wireless networks that could extend the coverage of previous
wireless network generations. That said, limited on-board energy seems to be
the main drawback that hinders the use of drones for wireless coverage.
Therefore, different wireless and wired charging techniques, such as laser
beaming, charging stations, and tether stations are proposed. In this paper, we
analyze and compare these different charging techniques by performing extensive
simulations for the scenario of drone-assisted data collection from
ground-based Internet of Things (IoT) devices. We analyze the strengths and
weaknesses of each charging technique, and finally show that laser-powered
drones compete strongly with, and in some scenarios outperform, other charging
techniques.
【12】 Neural Network Optimization for Reinforcement Learning Tasks Using Sparse Computations
Authors: Dmitry Ivanov, Mikhail Kiselev, Denis Larionov
Abstract: This article proposes a sparse computation-based method for optimizing neural
networks for reinforcement learning (RL) tasks. This method combines two ideas:
neural network pruning and taking into account input data correlations; it
makes it possible to update neuron states only when changes in them exceed a
certain threshold. It significantly reduces the number of multiplications when
running neural networks. We tested different RL tasks and achieved a 20-150x
reduction in the number of multiplications. There were no substantial
performance losses; sometimes the performance even improved.
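One way to realise "update only when the change exceeds a threshold" is a delta-style linear layer, sketched below. This is an assumption about how such a scheme could look, not the authors' code: the layer caches the last propagated input values, propagates only sufficiently large changes, and skips pruned (zero) weights, counting the multiplications it actually performs.

```python
class DeltaLayer:
    """Linear layer that re-propagates an input only when it changed by more
    than `threshold` since the last time it was propagated."""

    def __init__(self, weights, threshold):
        self.weights = weights                      # weights[i][j]: input i -> output j
        self.threshold = threshold
        self.last_input = [0.0] * len(weights)      # last propagated input values
        self.output = [0.0] * len(weights[0])       # running pre-activation sums
        self.multiplications = 0

    def forward(self, x):
        for i, value in enumerate(x):
            delta = value - self.last_input[i]
            if abs(delta) <= self.threshold:
                continue                            # change too small: no work done
            for j, w in enumerate(self.weights[i]):
                if w != 0.0:                        # pruned weights cost nothing
                    self.output[j] += w * delta
                    self.multiplications += 1
            self.last_input[i] = value
        return list(self.output)
```

When consecutive inputs are strongly correlated, as in many RL observation streams, most deltas fall under the threshold and the multiplication count grows far more slowly than for a dense layer.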
【13】 Visual Attention Prediction Improves Performance of Autonomous Drone Racing Agents
Authors: Christian Pfeiffer, Simon Wengeler, Antonio Loquercio, Davide Scaramuzza
Comments: 12 pages, 6 figures
Abstract: Humans race drones faster than neural networks trained for end-to-end
autonomous flight. This may be related to the ability of human pilots to select
task-relevant visual information effectively. This work investigates whether
neural networks capable of imitating human eye gaze behavior and attention can
improve neural network performance for the challenging task of vision-based
autonomous drone racing. We hypothesize that gaze-based attention prediction
can be an efficient mechanism for visual information selection and decision
making in a simulator-based drone racing task. We test this hypothesis using
eye gaze and flight trajectory data from 18 human drone pilots to train a
visual attention prediction model. We then use this visual attention prediction
model to train an end-to-end controller for vision-based autonomous drone
racing using imitation learning. We compare the drone racing performance of the
attention-prediction controller to those using raw image inputs and image-based
abstractions (i.e., feature tracks). Our results show that attention-prediction
based controllers outperform the baselines and are able to complete a
challenging race track consistently with up to 88% success rate. Furthermore,
visual attention-prediction and feature-track based models showed better
generalization performance than image-based models when evaluated on hold-out
reference trajectories. Our results demonstrate that human visual attention
prediction improves the performance of autonomous vision-based drone racing
agents and provides an essential step towards vision-based, fast, and agile
autonomous flight that eventually can reach and even exceed human performance.
【14】 Security Considerations for Virtual Reality Systems
Authors: Karthik Viswanathan
Abstract: There is a growing need for authentication methodology in virtual reality
applications. Current systems assume that the immersive experience technology
is a collection of peripheral devices connected to a personal computer or
mobile device. Hence there is a complete reliance on the computing device with
traditional authentication mechanisms to handle the authentication and
authorization decisions. Using the virtual reality controllers and headset
poses a different set of challenges as it is subject to unauthorized
observation, unannounced to the user given the fact that the headset completely
covers the field of vision in order to provide an immersive experience. As the
need for virtual reality experiences in the commercial world increases, there
is a need to provide alternative mechanisms for secure authentication. In
this paper, we analyze a few proposed authentication systems and conclude
that a multidimensional approach to authentication is needed to
address the granular authentication and authorization needs of
commercial virtual reality applications.
【15】 A Novel Incremental Learning Driven Instance Segmentation Framework to Recognize Highly Cluttered Instances of the Contraband Items
Authors: Taimur Hassan, Samet Akcay, Mohammed Bennamoun, Salman Khan, Naoufel Werghi
Comments: Accepted in IEEE T-SMC: Systems. Source code is available at this https URL
Abstract: Screening cluttered and occluded contraband items from baggage X-ray scans is
a cumbersome task even for expert security staff. This paper presents a
novel strategy that extends a conventional encoder-decoder architecture to
perform instance-aware segmentation and extract merged instances of contraband
items without using any additional sub-network or an object detector. The
encoder-decoder network first performs conventional semantic segmentation and
retrieves cluttered baggage items. The model then incrementally evolves during
training to recognize individual instances using significantly reduced training
batches. To avoid catastrophic forgetting, a novel objective function minimizes
the network loss in each iteration by retaining the previously acquired
knowledge while learning new class representations and resolving their complex
structural inter-dependencies through Bayesian inference. A thorough evaluation
of our framework on two publicly available X-ray datasets shows that it
outperforms state-of-the-art methods, especially within the challenging
cluttered scenarios, while achieving an optimal trade-off between detection
accuracy and efficiency.
【16】 Project IRL: Playful Co-Located Interactions with Mobile Augmented Reality
Authors: Ella Dagan, Ana Cárdenas Gasca, Ava Robinson, Anwar Noriega, Yu Jiang Tham, Rajan Vaish, Andrés Monroy-Hernández
Abstract: We present Project IRL (In Real Life), a suite of five mobile apps we created
to explore novel ways of supporting in-person social interactions with
augmented reality. In recent years, the tone of public discourse surrounding
digital technology has become increasingly critical, and technology's influence
on the way people relate to each other has been blamed for making people feel
"alone together," diverting their attention from truly engaging with one
another when they interact in person. Motivated by this challenge, we focus on
an under-explored design space: playful co-located interactions. We evaluated
the apps through a deployment study that involved interviews and participant
observations with 101 people. We synthesized the results into a series of
design guidelines that focus on four themes: (1) device arrangement (e.g., are
people sharing one phone, or does each person have their own?), (2) enablers
(e.g., should the activity focus on an object, body part, or pet?), (3)
affordances of modifying reality (i.e., features of the technology that enhance
its potential to encourage various aspects of social interaction), and (4)
co-located play (i.e., using technology to make in-person play engaging and
inviting). We conclude by presenting our design guidelines for future work on
embodied social AR.
【17】 In Situ Data Summaries for Flexible Feature Analysis in Large-Scale Multiphase Flow Simulations
Authors: Soumya Dutta, Terece Turton, David Rogers, Jordan Musser, James Ahrens, Ann Almgren
Abstract: The study of multiphase flow is essential for understanding the complex
interactions of various materials. In particular, when designing chemical
reactors such as fluidized bed reactors (FBR), a detailed understanding of the
hydrodynamics is critical for optimizing reactor performance and stability. An
FBR allows experts to conduct different types of chemical reactions involving
multiphase materials, especially interaction between gas and solids. During
such complex chemical processes, formation of void regions in the reactor,
generally termed bubbles, is an important phenomenon. The study of these bubbles
has deep implications for predicting the reactor's overall efficiency. But
physical experiments needed to understand bubble dynamics are costly and
non-trivial. Therefore, to study such chemical processes and bubble dynamics, a
state-of-the-art massively parallel computational fluid dynamics discrete
element model (CFD-DEM), MFIX-Exa, is being developed for simulating multiphase
flows. Despite the proven accuracy of MFIX-Exa in modeling bubbling phenomena,
the very large size of the output data makes traditional post hoc analysis
prohibitive in both storage and I/O time. To address these issues
and allow the application scientists to explore the bubble dynamics in an
efficient and timely manner, we have developed an end-to-end visual analytics
pipeline that enables in situ detection of bubbles using statistical
techniques, followed by a flexible and interactive visual exploration of bubble
dynamics in the post hoc analysis phase. Positive feedback from the experts has
indicated the efficacy of the proposed approach for exploring bubble dynamics
in very large-scale multiphase flow simulations.
【18】 On robust risk-based active-learning algorithms for enhanced decision support
Authors: Aidan J. Hughes, Lawrence A. Bull, Paul Gardner, Nikolaos Dervilis, Keith Worden
Comments: 48 pages, 39 figures, submitted to Mechanical Systems and Signal Processing
Abstract: Classification models are a fundamental component of physical-asset
management technologies such as structural health monitoring (SHM) systems and
digital twins. Previous work introduced risk-based active learning, an
online approach for the development of statistical classifiers that takes into
account the decision-support context in which they are applied. Decision-making
is considered by preferentially querying data labels according to
expected value of perfect information (EVPI). Although several
benefits are gained by adopting a risk-based active learning approach,
including improved decision-making performance, the algorithms suffer from
issues relating to sampling bias as a result of the guided querying process.
This sampling bias ultimately manifests as a decline in decision-making
performance during the later stages of active learning, which in turn
corresponds to lost resource/utility.
The current paper proposes two novel approaches to counteract the effects of
sampling bias: semi-supervised learning, and discriminative
classification models. These approaches are first visualised using a synthetic
dataset, then subsequently applied to an experimental case study, specifically,
the Z24 Bridge dataset. The semi-supervised learning approach is shown to have
variable performance, with robustness to sampling bias dependent on the
suitability of the generative distributions selected for the model with respect
to each dataset. In contrast, the discriminative classifiers are shown to have
excellent robustness to the effects of sampling bias. Moreover, it was found
that the number of inspections made during a monitoring campaign, and therefore
resource expenditure, could be reduced with the careful selection of the
statistical classifiers used within a decision-supporting monitoring system.
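Since the querying rule above rests on the expected value of perfect information, a small worked computation may help. EVPI is the expected utility of acting after the true state is revealed minus the best expected utility achievable on current beliefs alone. The two-action, two-state utility table in the usage note is invented purely for illustration.

```python
def expected_utility(action_utilities, state_probs):
    """Expected utility of one action under the current state beliefs."""
    return sum(p * u for p, u in zip(state_probs, action_utilities))


def evpi(utility, state_probs):
    """Expected value of perfect information.

    utility[a][s] is the utility of taking action a when the true state is s.
    """
    # Best expected utility when acting on current beliefs only.
    best_without = max(expected_utility(row, state_probs) for row in utility)
    # Expected utility if the true state were revealed before acting.
    best_with = sum(p * max(row[s] for row in utility)
                    for s, p in enumerate(state_probs))
    return best_with - best_without
```

For example, with hypothetical actions "do nothing" (utilities 0 if healthy, -100 if damaged) and "repair" (utility -10 either way) and beliefs of 0.9 healthy and 0.1 damaged, `evpi([[0, -100], [-10, -10]], [0.9, 0.1])` compares an expected utility of -1 with perfect information against -10 without it. A risk-based active learner would prefer to query labels for observations where this gap is large.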
【19】 Code-Switching Text Augmentation for Multilingual Speech Processing
Authors: Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali
Abstract: The pervasiveness of intra-utterance code-switching (CS) in spoken content
has forced ASR systems to handle mixed input. Yet, designing a CS-ASR system poses
many challenges, mainly due to data scarcity, grammatical structure
complexity, and mismatch, along with an unbalanced language usage distribution.
Recent ASR studies showed the predominance of E2E-ASR using multilingual data
to handle CS phenomena with little CS data. However, the dependency on the CS
data still remains. In this work, we propose a methodology to augment the
monolingual data for artificially generating spoken CS text to improve
different speech modules. We base our approach on Equivalence Constraint
theory while exploiting aligned translation pairs to generate grammatically
valid CS content. Our empirical results show a relative gain of 29-34% in
perplexity and around 2% in WER for two ecological and noisy CS test sets.
Finally, the human evaluation suggests that 83.8% of the generated data is
acceptable to humans.
【20】 Improving Surrogate Gradient Learning in Spiking Neural Networks via Regularization and Normalization
Authors: Nandan Meda
Comments: Bachelor Thesis
Abstract: Spiking neural networks (SNNs) are different from the classical networks used
in deep learning: the neurons communicate using electrical impulses called
spikes, just like biological neurons. SNNs are appealing for AI technology,
because they could be implemented on low power neuromorphic chips. However,
SNNs generally remain less accurate than their analog counterparts. In this
report, we examine various regularization and normalization techniques with the
goal of improving surrogate gradient learning in SNNs.
【21】 MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs
Authors: Qiaoyu Tan, Ninghao Liu, Xiao Huang, Rui Chen, Soo-Hyun Choi, Xia Hu
Abstract: We introduce a novel masked graph autoencoder (MGAE) framework to perform
effective learning on graph-structured data. Taking insights from
self-supervised learning, we randomly mask a large proportion of edges and try
to reconstruct these missing edges during training. MGAE has two core designs.
First, we find that masking a high ratio of the input graph structure, e.g.,
$70\%$, yields a nontrivial and meaningful self-supervisory task that benefits
downstream applications. Second, we employ a graph neural network (GNN) as an
encoder to perform message propagation on the partially-masked graph. To
reconstruct the large number of masked edges, a tailored cross-correlation
decoder is proposed. It can capture the cross-correlation between the head
and tail nodes of an anchor edge at multiple granularities. Coupling these two designs
enables MGAE to be trained efficiently and effectively. Extensive experiments
on multiple open datasets (Planetoid and OGB benchmarks) demonstrate that MGAE
generally performs better than state-of-the-art unsupervised learning
competitors on link prediction and node classification.
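The masking step and the "encoder sees only the partially masked graph" setup can be sketched as below. This is a simplified illustration: the mean-aggregation `propagate` function is a stand-in for a real GNN encoder, and the tailored cross-correlation decoder from the abstract is not reproduced. All function names are assumptions of this sketch.

```python
import random


def mask_edges(edges, mask_ratio=0.7, seed=0):
    """Split an edge list into (visible, masked) by hiding a random fraction."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    num_masked = int(round(mask_ratio * len(shuffled)))
    return shuffled[num_masked:], shuffled[:num_masked]


def propagate(features, visible_edges):
    """One round of mean aggregation over each node and its visible neighbours.

    features[v] is the feature vector of node v; this stands in for a GNN layer.
    """
    neighbours = [[v] for v in range(len(features))]   # every node includes itself
    for u, v in visible_edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    dim = len(features[0])
    return [[sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)]
            for nbrs in neighbours]
```

During training, embeddings would be computed with `propagate` on the visible edges only, and the masked edges would serve as the reconstruction targets for the decoder.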
【22】 NeROIC: Neural Rendering of Objects from Online Image Collections
Authors: Zhengfei Kuang, Kyle Olszewski, Menglei Chai, Zeng Huang, Panos Achlioptas, Sergey Tulyakov
Comments: Project page including code can be found at: this https URL
Abstract: We present a novel method to acquire object representations from online image
collections, capturing high-quality geometry and material properties of
arbitrary objects from photographs with varying cameras, illumination, and
backgrounds. This enables various object-centric rendering applications such as
novel-view synthesis, relighting, and harmonized background composition from
challenging in-the-wild input. Using a multi-stage approach extending neural
radiance fields, we first infer the surface geometry and refine the coarsely
estimated initial camera parameters, while leveraging coarse foreground object
masks to improve the training efficiency and geometry quality. We also
introduce a robust normal estimation technique which eliminates the effect of
geometric noise while retaining crucial details. Lastly, we extract surface
material properties and ambient illumination, represented in spherical
harmonics with extensions that handle transient elements, e.g. sharp shadows.
The union of these components results in a highly modular and efficient object
acquisition framework. Extensive evaluations and comparisons demonstrate the
advantages of our approach in capturing high-quality geometry and appearance
properties useful for rendering applications.
【23】 Learning Target-aware Representation for Visual Tracking via Informative Interactions
Authors: Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing, Yilin Lyu, Bing Li, Weiming Hu
Comments: 9 pages, 6 figures
Abstract: We introduce a novel backbone architecture to improve the target-perception
ability of feature representation for tracking. Specifically, we observe
that de facto frameworks perform feature matching simply using the outputs from
the backbone for target localization, so there is no direct feedback from the matching
module to the backbone network, especially the shallow layers. More concretely,
only the matching module can directly access the target information (in the
reference frame), while the representation learning of candidate frame is blind
to the reference target. As a consequence, the accumulation effect of
target-irrelevant interference in the shallow stages may degrade the feature
quality of deeper layers. In this paper, we approach the problem from a
different angle by conducting multiple branch-wise interactions inside the
Siamese-like backbone networks (InBN). At the core of InBN is a general
interaction modeler (GIM) that injects the prior knowledge of reference image
to different stages of the backbone network, leading to better
target-perception and robust distractor-resistance of candidate feature
representation with negligible computation cost. The proposed GIM module and
InBN mechanism are general and applicable to different backbone types including
CNN and Transformer for improvements, as evidenced by our extensive experiments
on multiple benchmarks. In particular, the CNN version (based on SiamCAR)
improves over the baseline by absolute SUC gains of 3.2/6.9 on LaSOT/TNL2K,
respectively. The Transformer version obtains SUC scores of 65.7/52.0 on
LaSOT/TNL2K, which are on par with the recent state of the art. Code and models
will be released.
【24】 RxWhyQA: a clinical question-answering dataset with the challenge of multi-answer questions
标题:RxWhyQA:一个具有多答案问题挑战的临床问答数据集
作者:Sungrim Moon,Huan He,Hongfang Liu,Jungwei W. Fan
备注:2 tables, 3 figures
摘要:Objectives: Create a dataset for the development and evaluation of clinical
question-answering (QA) systems that can handle multi-answer questions.
Materials and Methods: We leveraged the annotated relations from the 2018
National NLP Clinical Challenges (n2c2) corpus to generate a QA dataset. The
1-to-0 and 1-to-N drug-reason relations formed the unanswerable and
multi-answer entries, which represent challenging scenarios lacking in the
existing clinical QA datasets.
Results: The resulting RxWhyQA dataset contains
91,440 QA entries, of which half are unanswerable, and 21% (n=19,269) of the
answerable ones require multiple answers. The dataset conforms to the
community-vetted Stanford Question Answering Dataset (SQuAD) format.
Discussion: RxWhyQA is useful for comparing different systems that need to handle the
zero- and multi-answer challenges, demanding dual mitigation of both false
positive and false negative answers.
Conclusion: We created and shared a
clinical QA dataset with a focus on multi-answer questions to represent
real-world scenarios.
【25】 The Efficiency of the ANS Entropy Encoding
标题:ANS熵编码的效率分析
作者:Dmitry Kosolobov
备注:15 pages, 5 figures, 2 algorithms
摘要:Asymmetric Numeral Systems (ANS) is a class of entropy encoders by Duda
that has had an immense impact on data compression, replacing arithmetic and
Huffman coding. The optimality of ANS was studied by Duda et al. but the
precise asymptotic behaviour of its redundancy (in comparison to the entropy)
was not completely understood. In this paper we establish an optimal bound on
the redundancy for the tabled ANS (tANS), the most popular ANS variant. Given a
sequence $a_1,\ldots,a_n$ of letters from an alphabet $\{0,\ldots,\sigma-1\}$
such that each letter $a$ occurs in it $f_a$ times and $n=2^r$, the tANS
encoder using Duda's ``precise initialization'' to fill tANS tables transforms
this sequence into a bit string of length (frequencies are not included in the
encoding size): $$ \sum\limits_{a\in
[0..\sigma)}f_a\cdot\log\frac{n}{f_a}+O(\sigma+r), $$ where $O(\sigma + r)$ can
be bounded by $\sigma\log e+r$. The $r$-bit term is an encoder artifact
indispensable to ANS; the rest incurs a redundancy of $O(\frac{\sigma}{n})$
bits per letter. We complement this bound by a series of examples showing that
an $\Omega(\sigma+r)$ redundancy is necessary when $\sigma > n/3$, where
$\Omega(\sigma + r)$ is at least $\frac{\sigma-1}{4}+r-2$. We argue that
similar examples exist for any methods that distribute letters in tANS tables
using only the knowledge about frequencies. Thus, we refute Duda's conjecture
that the redundancy is $O(\frac{\sigma}{n^2})$ bits per letter.
We also propose a new variant of range ANS (rANS), called rANS with fixed
accuracy, that is parameterized by $k \ge 1$. In this variant the integer
division, which is unavoidable in rANS, is performed only in cases when its
result belongs to $[2^k..2^{k+1})$. Hence, the division can be computed by
faster methods provided $k$ is small. We bound the redundancy for the rANS with
fixed accuracy $k$ by $\frac{n}{2^k-1}\log e+r$.
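The two bounds stated in this abstract can be evaluated numerically. The following is a minimal sketch, not the paper's code: the function names are hypothetical, logarithms are assumed to be base 2 (sizes in bits), and every letter of the alphabet is assumed to occur at least once, so that $\sigma$ equals the number of distinct letters.

```python
import math
from collections import Counter


def tans_size_bounds(seq):
    """Return (entropy_term, upper_bound) in bits for a sequence of length n = 2^r.

    entropy_term is sum_a f_a * log2(n / f_a); upper_bound adds the
    sigma * log2(e) + r term quoted in the abstract.
    """
    n = len(seq)
    r = n.bit_length() - 1
    if n == 0 or n != 1 << r:
        raise ValueError("the stated bound assumes n = 2^r")
    freqs = Counter(seq)   # f_a for every letter a that occurs
    sigma = len(freqs)     # assumption: every letter of the alphabet occurs
    entropy_term = sum(f * math.log2(n / f) for f in freqs.values())
    upper_bound = entropy_term + sigma * math.log2(math.e) + r
    return entropy_term, upper_bound


def rans_fixed_accuracy_redundancy_bound(n, k, r):
    """Return n / (2^k - 1) * log2(e) + r, the redundancy bound quoted for k >= 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return n / (2 ** k - 1) * math.log2(math.e) + r


if __name__ == "__main__":
    entropy_term, upper_bound = tans_size_bounds([0, 0, 1, 1, 2, 2, 3, 3])
    print(entropy_term, upper_bound)
    print(rans_fixed_accuracy_redundancy_bound(n=8, k=3, r=3))
```

For a uniform sequence of 8 letters over 4 symbols, the entropy term is 16 bits, and the additive term contributes at most $4\log e + 3$ further bits.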
【26】 Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution
标题:基于知识图增强多视图卷积的医学文本再入院风险预测
作者:Qiuhao Lu,Thien Huu Nguyen,Dejing Dou
备注:SIGIR 2021
摘要:Unplanned intensive care unit (ICU) readmission rate is an important metric
for evaluating the quality of hospital care. Efficient and accurate prediction
of ICU readmission risk can not only help prevent patients from inappropriate
discharge and potential dangers, but also reduce associated costs of
healthcare. In this paper, we propose a new method that uses medical text of
Electronic Health Records (EHRs) for prediction, which provides an alternative
perspective to previous studies that heavily depend on numerical and
time-series features of patients. More specifically, we extract discharge
summaries of patients from their EHRs, and represent them with multiview graphs
enhanced by an external knowledge graph. Graph convolutional networks are then
used for representation learning. Experimental results prove the effectiveness
of our method, yielding state-of-the-art performance for this task.
【27】 Evaluation of Cyber Attacks Targeting Internet Facing IoT : An Experimental Evaluation
标题:面向互联网面向物联网的网络攻击评估:一项实验评估
作者:Navod Neranjan Thilakrathne,Rohan Samarasinghe,Madhuka Priyashan
摘要:The rapid growth of Information and Communication Technology (ICT) in the
21st century has resulted in the emergence of a novel technological paradigm
known as the Internet of Things, or IoT. The IoT, which is at the heart of
today's smart infrastructure, aids in the creation of a ubiquitous network of
things by simplifying interconnection between smart digital devices and
enabling Machine to Machine (M2M) communication. There are now numerous
IoT use cases that help people make their lives easier and more convenient.
With the latest advancement of IoT, however, a variety of cyber-attacks
target these pervasive IoT environments, and they can even jeopardize the
lives of the people involved. In general, the IoT can be considered to
comprise every digital object that is connected to the Internet for
intercommunication. To analyse the cyber threats that arrive through the
Internet, we carry out an experimental evaluation of the requests received to
exploit the open Secure Shell (SSH) connection service of an IoT device,
in our case a Raspberry Pi device that was connected to the Internet for
more than six consecutive days. With its SSH service open, the Raspberry Pi
acts as a honeypot device on which we can log and retrieve all login attempt
requests received by the SSH service. Motivated by the need to evaluate the
security attacks that target objects in the pervasive IoT environment, after
retrieving all the login requests made through the open SSH connection we
provide a comprehensive analysis along with our observations about the
origin of the requests and the focus areas of the intruders.
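The kind of log analysis described here can be sketched as follows. The abstract does not say how the authors logged or parsed the attempts, so this is only an illustration: it assumes the "Failed password for ... from ... port ..." lines commonly written by the OpenSSH server to the system authentication log, and the function name and sample line are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape (OpenSSH-style), e.g.:
#   "... sshd[1201]: Failed password for invalid user admin from 203.0.113.5 port 51240 ssh2"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port (?P<port>\d+)"
)


def summarize_attempts(log_lines):
    """Count failed SSH login attempts per attempted username and per source address."""
    users = Counter()
    sources = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            users[match.group("user")] += 1
            sources[match.group("ip")] += 1
    return users, sources


if __name__ == "__main__":
    # Made-up sample line using a documentation-only IP address.
    sample = [
        "Jan 10 03:12:45 raspberrypi sshd[1201]: "
        "Failed password for root from 203.0.113.5 port 51234 ssh2",
    ]
    users, sources = summarize_attempts(sample)
    print(users.most_common(5), sources.most_common(5))
```

Mapping each source address to a country or network, which an analysis of request origins would need, requires an external geolocation database and is left out of this sketch.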
【28】 Repairing Adversarial Texts through Perturbation
标题:通过扰动修复敌意文本
作者:Guoliang Dong,Jingyi Wang,Jun Sun,Sudipta Chattopadhyay,Xinyu Wang,Ting Dai,Jie Shi,Jin Song Dong
摘要:It is known that neural networks are subject to attacks through adversarial
perturbations, i.e., inputs which are maliciously crafted through perturbations
to induce wrong predictions. Furthermore, such attacks are impossible to
eliminate, i.e., the adversarial perturbation is still possible after applying
mitigation methods such as adversarial training. Multiple approaches have been
developed to detect and reject such adversarial inputs, mostly in the image
domain. Rejecting suspicious inputs, however, may not always be feasible or
ideal. First, normal inputs may be rejected due to false alarms generated by
the detection algorithm. Second, denial-of-service attacks may be conducted by
feeding such systems with adversarial inputs. To address the gap, in this work,
we propose an approach to automatically repair adversarial texts at runtime.
Given a text which is suspected to be adversarial, we apply multiple
adversarial perturbation methods in a novel, positive way to identify a repair, i.e.,
a slightly mutated but semantically equivalent text that the neural network
correctly classifies. Our approach has been evaluated on multiple models
trained for natural language processing tasks and the results show that our
approach is effective, i.e., it successfully repairs about 80% of the
adversarial texts. Furthermore, depending on the applied perturbation method,
an adversarial text could be repaired in as little as one second on average.
【29】 A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets
标题:基于合成数据集的无标记人体运动深度学习技术综述
作者:Doan Duy Vo,Russell Butler
备注:11 pages, 5 figures, 2 tables
摘要:Markerless motion capture has become an active field of research in computer
vision in recent years. Its extensive applications are known in a great variety
of fields, including computer animation, human motion analysis, biomedical
research, virtual reality, and sports science. Estimating human posture has
recently gained increasing attention in the computer vision community, but due
to the depth of uncertainty and the lack of synthetic datasets, it is a
challenging task. Various approaches have recently been proposed to solve this
problem, many of which are based on deep learning. They are primarily focused
on improving performance on existing benchmarks with significant advances,
especially on 2D images. Based on powerful deep learning techniques and recently
collected real-world datasets, we explored a model that can predict the
skeleton of an animation based solely on 2D images. Frames are generated from
different real-world datasets with synthesized poses using different body
shapes, from simple to complex. The implementation uses DeepLabCut on
its own dataset to perform many necessary steps, then uses the input frames to
train the model. The output is an animated skeleton for human movement. The
composite dataset and other results are the "ground truth" of the deep model.
【30】 Sign Language Video Retrieval with Free-Form Textual Queries
标题:基于自由格式文本查询的手语视频检索
作者:Amanda Duarte,Samuel Albanie,Xavier Giró-i-Nieto,Gül Varol
摘要:Systems that can efficiently search collections of sign language videos have
been highlighted as a useful application of sign language technology. However,
the problem of searching videos beyond individual keywords has received limited
attention in the literature. To address this gap, in this work we introduce the
task of sign language retrieval with free-form textual queries: given a written
query (e.g., a sentence) and a large collection of sign language videos, the
objective is to find the signing video in the collection that best matches the
written query. We propose to tackle this task by learning cross-modal
embeddings on the recently introduced large-scale How2Sign dataset of American
Sign Language (ASL). We identify that a key bottleneck in the performance of
the system is the quality of the sign video embedding which suffers from a
scarcity of labeled training data. We, therefore, propose SPOT-ALIGN, a
framework for interleaving iterative rounds of sign spotting and feature
alignment to expand the scope and scale of available training data. We validate
the effectiveness of SPOT-ALIGN for learning a robust sign video embedding
through improvements in both sign recognition and the proposed video retrieval
task.
【31】 Video Summarization Based on Video-text Representation
标题:基于图文表示的视频摘要
作者:Li Haopeng,Ke Qiuhong,Gong Mingming,Zhang Rui
摘要:Modern video summarization methods are based on deep neural networks which
require a large amount of annotated data for training. However, existing
datasets for video summarization are small-scale, easily leading to
over-fitting of the deep models. Considering that the annotation of large-scale
datasets is time-consuming, we propose a multimodal self-supervised learning
framework to obtain semantic representations of videos, which benefits the
video summarization task. Specifically, we explore the semantic consistency
between the visual information and text information of videos, for the
self-supervised pretraining of a multimodal encoder on a newly-collected
dataset of video-text pairs. Additionally, we introduce a progressive video
summarization method, where the important content in a video is pinpointed
progressively to generate better summaries. Finally, an objective evaluation
framework is proposed to measure the quality of video summaries based on video
classification. Extensive experiments have proved the effectiveness and
superiority of our method in rank correlation coefficients, F-score, and the
proposed objective evaluation compared to the state of the art.
【32】 Audio representations for deep learning in sound synthesis: A review
标题:声音合成中深度学习的音频表征:综述
作者:Anastasia Natsiou,Sean O'Leary
摘要:The rise of deep learning algorithms has led many researchers to withdraw
from using classic signal processing methods for sound generation. Deep
learning models have achieved expressive voice synthesis, realistic sound
textures, and musical notes from virtual instruments. However, the most
suitable deep learning architecture is still under investigation. The choice of
architecture is tightly coupled to the audio representations. A sound's
original waveform can be too dense and rich for deep learning models to deal
with efficiently - and complexity increases training time and computational
cost. Also, it does not represent sound in the manner in which it is perceived.
Therefore, in many cases, the raw audio has been transformed into a compressed
and more meaningful form using upsampling, feature-extraction, or even by
adopting a higher level illustration of the waveform. Furthermore, conditional
on the form chosen, additional conditioning representations, different model
architectures, and numerous metrics for evaluating the reconstructed sound have
been investigated. This paper provides an overview of audio representations
applied to sound synthesis using deep learning. Additionally, it presents the
most significant methods for developing and evaluating a sound synthesis
architecture using deep learning models, always depending on the audio
representation.
【33】 Semantic-based Data Augmentation for Math Word Problems
标题:基于语义的数学应用题数据增强
作者:Ailisi Li,Jiaqing Liang,Yanghua Xiao
摘要:It is hard for neural math word problem (MWP) solvers to deal with tiny
local variances. In the MWP task, some local changes preserve the original
semantics while others may totally change the underlying logic. Currently,
existing datasets for the MWP task contain only a limited number of the
samples that are key for neural models to learn to
disambiguate different kinds of local variances in questions and solve the
questions correctly. In this paper, we propose a set of novel data augmentation
approaches to supplement existing datasets with data augmented
with different kinds of local variances, helping to improve the generalization
ability of current neural models. New samples are generated by knowledge-guided
entity replacement and logic-guided problem reorganization. The augmentation
approaches ensure consistency between the new data and their
labels. Experimental results have shown the necessity and the effectiveness of
our methods.
【34】 Sparse PCA on fixed-rank matrices
标题:固定秩矩阵上的稀疏PCA
作者:Alberto Del Pia
摘要:Sparse PCA is the optimization problem obtained from PCA by adding a sparsity
constraint on the principal components. Sparse PCA is NP-hard and hard to
approximate even in the single-component case. In this paper we settle the
computational complexity of sparse PCA with respect to the rank of the
covariance matrix. We show that, if the rank of the covariance matrix is a
fixed value, then there is an algorithm that solves sparse PCA to global
optimality, whose running time is polynomial in the number of features. We also
prove a similar result for the version of sparse PCA which requires the
principal components to have disjoint supports.
【35】 Automated Dissipation Control for Turbulence Simulation with Shell Models
标题:壳模型湍流模拟中的自动耗散控制
作者:Ann-Kathrin Dombrowski,Klaus-Robert Müller,Wolf Christian Müller
摘要:The application of machine learning (ML) techniques, especially neural
networks, has seen tremendous success at processing images and language. This
is because we often lack formal models to understand visual and audio input, so
here neural networks can unfold their abilities as they can model solely from
data. In the field of physics we typically have models that describe natural
processes reasonably well on a formal level. Nonetheless, in recent years, ML
has also proven useful in these realms, be it by speeding up numerical
simulations or by improving accuracy. One important and so far unsolved problem
in classical physics is understanding turbulent fluid motion. In this work we
construct a strongly simplified representation of turbulence by using the
Gledzer-Ohkitani-Yamada (GOY) shell model. With this system we intend to
investigate the potential of ML-supported and physics-constrained small-scale
turbulence modelling. Instead of standard supervised learning we propose an
approach that aims to reconstruct statistical properties of turbulence such as
the self-similar inertial-range scaling, where we could achieve encouraging
experimental results. Furthermore we discuss pitfalls when combining machine
learning with differential equations.
【36】 A sinusoidal signal reconstruction method for the inversion of the mel-spectrogram
标题:一种用于Mel谱图反演的正弦信号重构方法
作者:Anastasia Natsiou,Sean O'Leary
摘要:The synthesis of sound via deep learning methods has recently received much
attention. Some problems for deep learning approaches to sound synthesis relate
to the amount of data needed to specify an audio signal and the necessity of
preserving both the long and short time coherence of the synthesised signal.
Visual time-frequency representations such as the log-mel-spectrogram have
gained in popularity. The log-mel-spectrogram is a perceptually informed
representation of audio that greatly compresses the amount of information
required for the description of the sound. However, because of this
compression, this representation is not directly invertible. Both signal
processing and machine learning techniques have previously been applied to the
inversion of the log-mel-spectrogram but they both caused audible distortions
in the synthesized sounds due to issues of temporal and spectral coherence. In
this paper, we outline the application of a sinusoidal model to the inversion
of the log-mel-spectrogram for pitched musical instrument sounds outperforming
state-of-the-art deep learning methods. The approach could be later used as a
general decoding step from spectral to time intervals in neural applications.
【37】 Bayesian Neural Networks for Reversible Steganography
标题:用于可逆隐写的贝叶斯神经网络
作者:Ching-Chun Chang
摘要:Recent advances in deep learning have led to a paradigm shift in reversible
steganography. A fundamental pillar of reversible steganography is predictive
modelling which can be realised via deep neural networks. However, non-trivial
errors exist in inferences about some out-of-distribution and noisy data. In
view of this issue, we propose to consider uncertainty in predictive models
based upon a theoretical framework of Bayesian deep learning. Bayesian neural
networks can be regarded as self-aware machinery; that is, a machine that knows
its own limitations. To quantify uncertainty, we approximate the posterior
predictive distribution through Monte Carlo sampling with stochastic forward
passes. We further show that predictive uncertainty can be disentangled into
aleatoric and epistemic uncertainties and these quantities can be learnt in an
unsupervised manner. Experimental results demonstrate an improvement delivered
by Bayesian uncertainty analysis upon steganographic capacity-distortion
performance.
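The idea of estimating predictive uncertainty from stochastic forward passes can be illustrated with a small sketch. It assumes a regression-style model whose every stochastic pass returns a predicted mean and a predicted variance, and it uses a widely used decomposition (aleatoric as the average predicted variance, epistemic as the spread of the predicted means). The abstract does not give the author's exact formulas, so the function name and this decomposition are assumptions.

```python
def decompose_uncertainty(passes):
    """Split predictive uncertainty over T stochastic forward passes.

    passes: non-empty list of (predicted_mean, predicted_variance) pairs,
            one pair per stochastic forward pass for the same input.
    Returns (predictive_mean, aleatoric, epistemic).
    """
    if not passes:
        raise ValueError("at least one forward pass is required")
    t = len(passes)
    means = [m for m, _ in passes]
    variances = [v for _, v in passes]
    predictive_mean = sum(means) / t
    # Aleatoric: noise the model attributes to the data, averaged over passes.
    aleatoric = sum(variances) / t
    # Epistemic: disagreement between passes about the predicted mean.
    epistemic = sum((m - predictive_mean) ** 2 for m in means) / t
    return predictive_mean, aleatoric, epistemic


if __name__ == "__main__":
    # Two hypothetical passes for one input.
    print(decompose_uncertainty([(1.0, 0.5), (3.0, 0.5)]))
```

Under this decomposition, the total predictive variance is the sum of the two terms.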
【38】 Modeling International Mobility using Roaming Cell Phone Traces during COVID-19 Pandemic
标题:利用冠状病毒大流行期间漫游手机痕迹模拟国际流动性
作者:Massimiliano Luca,Bruno Lepri,Enrique Frias-Martinez,Andra Lutu
摘要:Most of the studies related to human mobility are focused on intra-country
mobility. However, there are many scenarios (e.g., spreading diseases,
migration) in which timely data on international commuters are vital. Mobile
phones represent a unique opportunity to monitor international mobility flows
in a timely manner and with proper spatial aggregation. This work proposes
using roaming data generated by mobile phones to model incoming and outgoing
international mobility. We use the gravity and radiation models to capture
mobility flows before and during the introduction of non-pharmaceutical
interventions. However, traditional models have some limitations: for instance,
mobility restrictions are not explicitly captured and may play a crucial role.
To overcome such limitations, we propose the COVID Gravity Model (CGM), namely
an extension of the traditional gravity model that is tailored for the pandemic
scenario. This proposed approach outperforms, in terms of accuracy, the
traditional models by 126.9% for incoming mobility and by 63.9% when modeling
outgoing mobility flows.
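For readers unfamiliar with the traditional gravity model mentioned here, a minimal sketch of its classic power-law form follows. It is not the CGM extension proposed by the authors, whose details the abstract does not give; the function name, the parameter names, and the default exponents are illustrative assumptions.

```python
def gravity_flow(mass_origin, mass_dest, distance,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Classic gravity-model estimate of the flow between two places.

    flow = k * mass_origin^alpha * mass_dest^beta / distance^gamma
    The masses are typically populations; k and the exponents are
    normally fitted to observed flows rather than fixed in advance.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * (mass_origin ** alpha) * (mass_dest ** beta) / (distance ** gamma)


if __name__ == "__main__":
    # Hypothetical masses and distance, with the default exponents.
    print(gravity_flow(100, 200, 10))
```

With the default exponents, doubling the distance divides the estimated flow by four, which is the inverse-square behaviour the model's name alludes to.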
【39】 Similarities and Differences between Machine Learning and Traditional Advanced Statistical Modeling in Healthcare Analytics
标题:医疗分析中机器学习与传统高级统计建模的异同
作者:Michele Bennett,Karin Hayes,Ewa J. Kleczyk,Rajesh Mehta
备注:16 pages, 2 figures
摘要:Data scientists and statisticians are often at odds when determining the best
approach, machine learning or statistical modeling, to solve an analytics
challenge. However, machine learning and statistical modeling are more cousins
than adversaries on different sides of an analysis battleground. Choosing
between the two approaches or in some cases using both is based on the problem
to be solved and outcomes required as well as the data available for use and
circumstances of the analysis. Machine learning and statistical modeling are
complementary, based on similar mathematical principles, but simply using
different tools in an overall analytics knowledge base. Determining the
predominant approach should be based on the problem to be solved as well as
empirical evidence, such as size and completeness of the data, number of
variables, assumptions or lack thereof, and expected outcomes such as
predictions or causality. Good analysts and data scientists should be well
versed in both techniques and their proper application, thereby using the right
tool for the right project to achieve the desired results.
【40】 On The Decoding Error Weight of One or Two Deletion Channels
标题:关于一个或两个删除信道的译码误码权重
作者:Omer Sabary,Daniella Bar-Lev,Yotam Gershon,Alexander Yucovich,Eitan Yaakobi
备注:arXiv admin note: text overlap with arXiv:2001.05582
摘要:This paper tackles two problems that are relevant to coding for insertions
and deletions. These problems are motivated by several applications, among them
the reconstruction of strands in DNA-based storage systems. Under this paradigm, a
word is transmitted over some fixed number of identical independent channels
and the goal of the decoder is to output the transmitted word or some close
approximation of it. The first part of this paper studies the deletion channel
that deletes a symbol with some fixed probability $p$, while focusing on two
instances of this channel. Since operating the maximum likelihood (ML) decoder
in this case is computationally infeasible, we study a slightly degraded
version of this decoder for two channels and its expected normalized distance.
We identify the dominant error patterns and, based on these observations,
derive that the expected normalized distance of the degraded ML decoder is
roughly $\frac{3q-1}{q-1}p^2$, when the transmitted word is any $q$-ary
sequence and $p$ is the channel's deletion probability. We also study the cases
when the transmitted word belongs to the Varshamov-Tenengolts (VT) code or the
shifted VT code. Additionally, the insertion channel is studied as well as the
case of two insertion channels. These theoretical results are verified by
corresponding simulations. The second part of the paper studies optimal
decoding for a special case of the deletion channel, the $k$-deletion channel,
which deletes exactly $k$ symbols of the transmitted word uniformly at random.
In this part, the goal is to understand how an optimal decoder operates in
order to minimize the expected normalized distance. A full characterization of
an efficient optimal decoder for this setup, referred to as the maximum
likelihood* (ML*) decoder, is given for a channel that deletes one or two
symbols.
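The two channel models in this abstract are simple to simulate, and the quoted approximation is easy to evaluate. The sketch below covers only the channels and the formula $\frac{3q-1}{q-1}p^2$, not the degraded ML or ML* decoders, which the abstract does not specify; all function names are hypothetical.

```python
import random


def deletion_channel(word, p, rng):
    """Delete each symbol independently with probability p."""
    return [s for s in word if rng.random() >= p]


def k_deletion_channel(word, k, rng):
    """Delete exactly k symbols, with the positions chosen uniformly at random."""
    if not 0 <= k <= len(word):
        raise ValueError("k must be between 0 and len(word)")
    deleted = set(rng.sample(range(len(word)), k))
    return [s for i, s in enumerate(word) if i not in deleted]


def approx_expected_normalized_distance(q, p):
    """Evaluate the approximation (3q - 1) / (q - 1) * p^2 quoted in the abstract."""
    if q < 2:
        raise ValueError("q must be at least 2")
    return (3 * q - 1) / (q - 1) * p ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    word = list("ACGTACGTACGT")          # a 4-ary word, as in DNA storage
    print(deletion_channel(word, 0.1, rng))
    print(k_deletion_channel(word, 2, rng))
    print(approx_expected_normalized_distance(q=4, p=0.01))
```

Passing two independent outputs of `deletion_channel` for the same word to a decoder corresponds to the two-channel setting studied in the first part of the paper.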
【41】 Churn prediction in online gambling
标题:在线赌博中的流失预测
作者:Florian Merchie,Damien Ernst
备注:14 pages, 3 figures Submitted to Expert Systems with Applications
摘要:In business retention, churn prevention has always been a major concern. This
work contributes to this domain by formalizing the problem of churn prediction
in the context of online gambling as a binary classification task. We also
propose an algorithmic answer to this problem based on recurrent neural
networks. This algorithm is tested with online gambling data that have the form
of time series, which can be efficiently processed by recurrent neural
networks. To evaluate the performance of the trained models, standard machine
learning metrics were used, such as accuracy, precision and recall. For this
problem in particular, the experiments showed that the
choice of a specific architecture depends on the metric which is given the
greatest importance. Architectures using nBRC favour precision, those using
LSTM give better recall, while GRU-based architectures allow a higher accuracy
and balance the two other metrics. Moreover, further experiments showed that using
only the more recent time-series histories to train the networks decreases the
quality of the results. We also study the performance of models learned at a
specific instant $t$ when applied at other times $t^{\prime} > t$. The results show that
the performance of the models learned at time $t$ remains good at the following
instants $t^{\prime} > t$, suggesting that there is no need to refresh the
models at a high rate. However, the performance of the models was subject to
noticeable variance due to one-off events impacting the data.
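Since the choice of architecture here hinges on which metric matters most, it may help to recall how the three metrics are computed for a binary churn label. This is a generic sketch with a hypothetical function name and made-up labels, where 1 stands for "churner"; it is not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # loyal players flagged by mistake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

A model tuned for precision flags fewer loyal players by mistake, while one tuned for recall misses fewer actual churners, which is the trade-off the abstract reports between the architectures.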
【42】 A SIMD algorithm for the detection of epistatic interactions of any order
标题:检测任意阶上位性相互作用的SIMD算法
作者:Christian Ponte-Fernández,Jorge González-Domínguez,María J. Martín
备注:Submitted to Future Generation Computer Systems. Codes used are available at this https URL
摘要:Epistasis is a phenomenon in which a phenotype outcome is determined by the
interaction of genetic variation at two or more loci and it cannot be
attributed to the additive combination of effects corresponding to the
individual loci. Although it has been more than 100 years since William Bateson
introduced this concept, it still is a topic under active research. Locating
epistatic interactions is a computationally expensive challenge that involves
analyzing an exponentially growing number of combinations. Authors in this
field have resorted to a multitude of hardware architectures in order to speed
up the search, but little to no attention has been paid to the vector
instructions that current CPUs include in their instruction sets. This work
extends an existing third-order exhaustive algorithm to support the search of
epistasis interactions of any order and discusses multiple SIMD implementations
of the different functions that compose the search using Intel AVX Intrinsics.
Results using GCC and the Intel compiler show that the 512-bit explicit
vector implementation proposed here performs best among all the
implementations evaluated. The proposed 512-bit vectorization accelerates the
original implementation of the algorithm by an average factor of 7 and 12, for
GCC and the Intel Compiler, respectively, in the scenarios tested.
【43】 Deep Learnable Strategy Templates for Multi-Issue Bilateral Negotiation
标题:多议题双边谈判的深度学习策略模板
作者:Pallavi Bagga,Nicola Paoletti,Kostas Stathis
备注:arXiv admin note: text overlap with arXiv:2009.08302
摘要:We study how to exploit the notion of strategy templates to learn strategies
for multi-issue bilateral negotiation. Each strategy template consists of a set
of interpretable parameterized tactics that are used to decide an optimal
action at any time. We use deep reinforcement learning through an
actor-critic architecture to estimate the tactic parameter values for a
threshold utility, when to accept an offer and how to generate a new bid. This
contrasts with existing work that only estimates the threshold utility for
those tactics. We pre-train the strategy by supervision from the dataset
collected using "teacher strategies", thereby decreasing the exploration time
required for learning during negotiation. As a result, we build automated
agents for multi-issue negotiations that can adapt to different negotiation
domains without the need to be pre-programmed. We empirically show that our
work outperforms the state-of-the-art in terms of the individual as well as
social efficiency.
【44】 Analytical calculation formulas for capacities of classical and classical-quantum channels
标题:经典信道和经典量子信道容量的解析计算公式
作者:Masahito Hayashi
摘要:We derive an analytical formula for the channel capacity of a
classical channel that requires no iteration, whereas existing algorithms require
iterations, and the number of iterations depends on the required precision level.
Hence, our formula is the first analytical formula for this capacity without any
iteration. We apply the obtained formula to examples and see how it works
in these examples. Then, we extend it to the channel capacity of a
classical-quantum (cq-) channel. Many existing studies proposed algorithms for
a cq-channel and all of them require iterations. Our extended analytical
algorithm also requires no iteration and outputs the exact optimum values.
【45】 Online 3-Axis Magnetometer Hard-Iron and Soft-Iron Bias and Angular Velocity Sensor Bias Estimation Using Angular Velocity Sensors for Improved Dynamic Heading Accuracy
标题:在线三轴磁强计硬铁和软铁偏差和角速度传感器偏差估计使用角速度传感器提高动态航向精度
作者:Andrew R. Spielvogel,Abhimanyu S. Shah,Louis L. Whitcomb
备注:Preprint of an article accepted for publication in Field Robotics, this https URL, Special Issue in Unmanned Marine Systems. Submitted January 16, 2021; Revised May 28, 2021; Accepted August 2, 2021
摘要:This article addresses the problem of dynamic on-line estimation and
compensation of hard-iron and soft-iron biases of 3-axis magnetometers under
dynamic motion in field robotics, utilizing only biased measurements from a
3-axis magnetometer and a 3-axis angular rate sensor. The proposed magnetometer
and angular velocity bias estimator (MAVBE) utilizes a 15-state process model
encoding the nonlinear process dynamics for the magnetometer signal subject to
angular velocity excursions, while simultaneously estimating 9 magnetometer
bias parameters and 3 angular rate sensor bias parameters, within an extended
Kalman filter framework. Bias parameter local observability is numerically
evaluated. The bias-compensated signals, together with 3-axis accelerometer
signals, are utilized to estimate bias compensated magnetic geodetic heading.
Performance of the proposed MAVBE method is evaluated in comparison to the
widely cited magnetometer-only TWOSTEP method in numerical simulations,
laboratory experiments, and full-scale field trials of an instrumented
autonomous underwater vehicle in the Chesapeake Bay, MD, USA. For the proposed
MAVBE, (i) instrument attitude is not required to estimate biases, and the
results show that (ii) the biases are locally observable, (iii) the bias
estimates converge rapidly to true bias parameters, (iv) only modest instrument
excitation is required for bias estimate convergence, and (v) compensation for
magnetometer hard-iron and soft-iron biases dramatically improves dynamic
heading estimation accuracy.
【46】 k-Center Clustering with Outliers in Sliding Windows
标题:滑动窗口中带离群点的K-中心聚类
作者:Paolo Pellizzoni,Andrea Pietracaprina,Geppino Pucci
摘要:Metric $k$-center clustering is a fundamental unsupervised learning
primitive. Although widely used, this primitive is heavily affected by noise in
the data, so a more sensible variant seeks the best solution that
disregards a given number $z$ of points of the dataset, called outliers. We
provide efficient algorithms for this important variant in the streaming model
under the sliding window setting, where, at each time step, the dataset to be
clustered is the window $W$ of the most recent data items. Our algorithms
achieve $O(1)$ approximation and, remarkably, require a working memory linear
in $k+z$ and only logarithmic in $|W|$. As a by-product, we show how to
estimate the effective diameter of the window $W$, which is a measure of the
spread of the window points, disregarding a given fraction of noisy distances.
We also provide experimental evidence of the practical viability of our
theoretical results.
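As a point of reference for the objective described above, the following is a minimal sketch and not the authors' sliding-window algorithm. It shows Gonzalez's greedy farthest-first traversal (a classical 2-approximation for plain k-center, itself sensitive to outliers) and a helper that evaluates the clustering radius when the $z$ farthest points are disregarded. The function names and the toy data are assumptions made for illustration.

```python
import math

def farthest_first_centers(points, k):
    """Gonzalez's greedy traversal for plain k-center (no outlier handling)."""
    centers = [points[0]]
    # dist[i] = distance from points[i] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        idx = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[idx])
        dist = [min(d, math.dist(p, points[idx])) for d, p in zip(dist, points)]
    return centers

def radius_with_outliers(points, centers, z):
    """Radius of the clustering after disregarding the z farthest points."""
    d = sorted(min(math.dist(p, c) for c in centers) for p in points)
    return d[len(d) - 1 - z] if z < len(d) else 0.0

# Hypothetical toy data: two tight groups plus one far-away noisy point.
pts = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
greedy = farthest_first_centers(pts, 2)
```

On this toy input the greedy traversal spends a center on the noisy point, whereas centers placed in the two groups give a much smaller radius once one outlier is disregarded, which is the motivation for the variant studied in the abstract.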
【47】 Bregman divergence based em algorithm and its application to classical and quantum rate distortion theory
标题:基于Bregman散度的em算法及其在经典和量子率失真理论中的应用
作者:Masahito Hayashi
摘要:We formulate the em algorithm in the framework of Bregman divergence, which is a
general problem setting of information geometry. That is, we address the
minimization problem of the Bregman divergence between an exponential subfamily
and a mixture subfamily in a Bregman divergence system. Then, we show the
convergence and its speed under several conditions. We apply this algorithm to
rate distortion and its variants including the quantum setting, and show the
usefulness of our general algorithm.
【48】 Continuous-time Radar-inertial Odometry for Automotive Radars
标题:汽车雷达的连续时间雷达惯性里程计
作者:Yin Zhi Ng,Benjamin Choi,Robby Tan,Lionel Heng
备注:In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
摘要:We present an approach for radar-inertial odometry which uses a
continuous-time framework to fuse measurements from multiple automotive radars
and an inertial measurement unit (IMU). Adverse weather conditions do not have
a significant impact on the operating performance of radar sensors unlike that
of camera and LiDAR sensors. Radar's robustness in such conditions and the
increasing prevalence of radars on passenger vehicles motivate us to look at
the use of radar for ego-motion estimation. A continuous-time trajectory
representation is applied not only as a framework to enable heterogeneous and
asynchronous multi-sensor fusion, but also to facilitate efficient
optimization by being able to compute poses and their derivatives in
closed-form and at any given time along the trajectory. We compare our
continuous-time estimates to those from a discrete-time radar-inertial odometry
approach and show that our continuous-time method outperforms the discrete-time
method. To the best of our knowledge, this is the first time a continuous-time
framework has been applied to radar-inertial odometry.
【49】 Spatial-Temporal Sequential Hypergraph Network for Crime Prediction
标题:时空序列超图网络在犯罪预测中的应用
作者:Lianghao Xia,Chao Huang,Yong Xu,Peng Dai,Liefeng Bo,Xiyue Zhang,Tianyi Chen
备注:IJCAI 2021 Research Paper
摘要:Crime prediction is crucial for public safety and resource optimization, yet
is very challenging due to two aspects: i) the dynamics of criminal patterns
across time and space, with crime events distributed unevenly over both spatial
and temporal domains; ii) time-evolving dependencies between different types of
crimes (e.g., Theft, Robbery, Assault, Damage) which reveal fine-grained
semantics of crimes. To tackle these challenges, we propose Spatial-Temporal
Sequential Hypergraph Network (ST-SHN) to collectively encode complex crime
spatial-temporal patterns as well as the underlying category-wise crime
semantic relationships. Specifically, to handle spatial-temporal dynamics under
the long-range and global context, we design a graph-structured message passing
architecture with the integration of the hypergraph learning paradigm. To
capture category-wise crime heterogeneous relations in a dynamic environment,
we introduce a multi-channel routing mechanism to learn the time-evolving
structural dependency across crime types. We conduct extensive experiments on
two real-world datasets, showing that our proposed ST-SHN framework can
significantly improve the prediction performance as compared to various
state-of-the-art baselines. The source code is available at:
https://github.com/akaxlh/ST-SHN.
【50】 Forecasting emissions through Kaya identity using Neural Ordinary Differential Equations
标题:基于神经元常微分方程的Kaya恒等式排放量预测
作者:Pierre Browne,Aranildo Lima,Rossella Arcucci,César Quilodrán-Casas
备注:5 pages, 2 figures, Tackling Climate Change with Machine Learning workshop at ICML 2021
摘要:Starting from the Kaya identity, we used a Neural ODE model to predict the
evolution of several indicators related to carbon emissions at the country
level: population, GDP per capita, energy intensity of GDP, and carbon
intensity of energy. We compared the model with a baseline statistical model,
VAR, and obtained good performance. We conclude that this machine-learning
approach can be used to produce a wide range of results and give relevant
insight to policymakers.
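For readers unfamiliar with the Kaya identity, it writes total emissions as the product of the four indicators listed in the abstract: population, GDP per capita, energy intensity of GDP, and carbon intensity of energy. The sketch below only illustrates that decomposition; it is not the Neural ODE model from the paper, and the function names and input figures are hypothetical.

```python
def kaya_factors(population, gdp, energy, emissions):
    """Split total emissions into the four Kaya factors."""
    return {
        "population": population,
        "gdp_per_capita": gdp / population,
        "energy_intensity": energy / gdp,
        "carbon_intensity": emissions / energy,
    }

def kaya_emissions(factors):
    """Recombine the factors: emissions = P * (GDP/P) * (E/GDP) * (CO2/E)."""
    product = 1.0
    for value in factors.values():
        product *= value
    return product

# Hypothetical country-level totals (arbitrary units), for illustration only.
factors = kaya_factors(population=50e6, gdp=2e12, energy=4e9, emissions=3e8)
recombined = kaya_emissions(factors)
```

Forecasting each factor separately and multiplying the forecasts back together is one way to obtain an emissions trajectory that stays consistent with the identity.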
【51】 Automatic Speech Recognition Datasets in Cantonese Language: A Survey and a New Dataset
标题:粤语自动语音识别数据集:综述和一个新的数据集
作者:Tiezheng Yu,Rita Frieske,Peng Xu,Samuel Cahyawijaya,Cheuk Tung Shadow Yiu,Holy Lovenia,Wenliang Dai,Elham J. Barezi,Qifeng Chen,Xiaojuan Ma,Bertram E. Shi,Pascale Fung
摘要:Automatic speech recognition (ASR) for low-resource languages improves the
access of linguistic minorities to technological advantages provided by
Artificial Intelligence (AI). In this paper, we address the problem of data
scarcity for Hong Kong Cantonese by creating a new Cantonese dataset. Our dataset,
Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read
speech paired with transcripts, collected from Cantonese audiobooks from Hong
Kong. It combines philosophy, politics, education, culture, lifestyle and
family domains, covering a wide range of topics. We also review all existing
Cantonese datasets and perform experiments on the two biggest datasets (MDCC
and Common Voice zh-HK). We analyze the existing datasets according to their
speech type, data source, total size and availability. The results of
experiments conducted with Fairseq S2T Transformer, a state-of-the-art ASR
model, show the effectiveness of our dataset. In addition, we create a powerful
and robust Cantonese ASR model by applying multi-dataset learning on MDCC and
Common Voice zh-HK.
【52】 Developing Assistive Technology to Support Reminiscence Therapy: A User-Centered Study to Identify Caregivers' Needs
标题:开发辅助技术支持记忆治疗:一项以用户为中心的研究,以确定照顾者的需求
作者:Soraia M. Alarcão,André Santana,Carolina Maruta,Manuel J. Fonseca
备注:27 pages, 2 figures, Manuscript submitted to the Special Issue on Advances in Human-Centred Dementia Technology of the International Journal of Human-Computer Studies
摘要:Reminiscence therapy is an inexpensive non-pharmacological therapy commonly
used due to its therapeutic value for PwD, as it can be used to promote
independence, positive moods and behavior, and improve their quality of life.
Caregivers are one of the main pillars in the adoption of digital technologies
for reminiscence therapy, as they are responsible for its administration.
Despite their comprehensive understanding of the needs and difficulties
associated with the therapy, their perspective has not been fully taken into
account in the development of existing technological solutions. To inform the
design of technological solutions within dementia care, we followed a
user-centered design approach through worldwide surveys, follow-up
semi-structured interviews, and focus groups. Seven hundred and seven informal
and 52 formal caregivers participated in our study. Our findings show that
technological solutions must provide mechanisms to carry out the therapy in a
simple way, reducing the amount of work for caregivers when preparing and
conducting therapy sessions. They should also diversify and personalize the
current session (and following ones) based on both the biographical information
of the PwD and their emotional reactions. This is particularly important since
the PwD often become agitated, aggressive or angry, and caregivers might not
know how to properly deal with this situation (in particular, the informal
ones). Additionally, formal caregivers need an easy way to manage information
of the different PwD they take care of, and consult the history of sessions
performed (in particular, to identify images that triggered negative emotional
reactions, and consult any notes taken about them). As a result, we present a
list of validated functional requirements gathered for the PwD and both formal
and informal caregivers, as well as the corresponding expected primary and
secondary outcomes.
【53】 Auction-Based Ex-Post-Payment Incentive Mechanism Design for Horizontal Federated Learning with Reputation and Contribution Measurement
标题:基于拍卖的带声誉和贡献度的横向联合学习支付后激励机制设计
作者:Jingwen Zhang,Yuezhou Wu,Rong Pan
摘要:Federated learning trains models across devices with distributed data, while
protecting privacy and obtaining a model similar to that of centralized ML.
A large number of workers with data and computing power are the foundation of
federated learning. However, the inevitable costs prevent self-interested workers
from serving for free. Moreover, due to data isolation, task publishers lack
effective methods to select, evaluate and pay reliable workers with
high-quality data. Therefore, we design an auction-based incentive mechanism
for horizontal federated learning with reputation and contribution measurement.
By designing a reasonable method of measuring contribution, we establish the
reputation of workers, which is easy to decline and difficult to improve.
Through reverse auctions, workers bid for tasks, and the task publisher selects
workers combining reputation and bid price. With the budget constraint, winning
workers are paid based on performance. We proved that our mechanism satisfies
the individual rationality of the honest worker, budget feasibility,
truthfulness, and computational efficiency.
【54】 Tight Fine-Grained Bounds for Direct Access on Join Queries
标题:连接查询直接访问的紧致细粒度界限
作者:Karl Bringmann,Nofar Carmeli,Stefan Mengel
摘要:We consider the task of lexicographic direct access to query answers. That
is, we want to simulate an array containing the answers of a join query sorted
in a lexicographic order chosen by the user. A recent dichotomy showed for
which queries and orders this task can be done in polylogarithmic access time
after quasilinear preprocessing, but this dichotomy does not tell us how much
time is required in the cases classified as hard. We determine the
preprocessing time needed to achieve polylogarithmic access time for all
self-join free queries and all lexicographical orders. To this end, we propose
a decomposition-based general algorithm for direct access on join queries. We
then explore its optimality by proving lower bounds for the preprocessing time
based on the hardness of a certain online Set-Disjointness problem, which shows
that our algorithm's bounds are tight for all lexicographic orders on self-join
free queries. Then, we prove the hardness of Set-Disjointness based on the
Zero-Clique Conjecture which is an established conjecture from fine-grained
complexity theory. We also show that similar techniques can be used to prove
that, for enumerating answers to Loomis-Whitney joins, it is not possible to
significantly improve upon trivially computing all answers at preprocessing.
This, in turn, gives further evidence (based on the Zero-Clique Conjecture) to
the enumeration hardness of self-join free cyclic joins with respect to linear
preprocessing and constant delay.
【55】 Neural calibration of hidden inhomogeneous Markov chains -- Information decompression in life insurance
标题:隐含非齐次马氏链的神经标定--人寿保险中的信息解压缩
作者:Mark Kiermayer,Christian Weiß
摘要:Markov chains play a key role in a vast number of areas, including life
insurance mathematics. Standard actuarial quantities such as the premium value can
be interpreted as compressed, lossy information about the underlying Markov
process. We introduce a method to reconstruct the underlying Markov chain given
collective information of a portfolio of contracts. Our neural architecture
explainably characterizes the process by explicitly providing one-step
transition probabilities. Further, we provide an intrinsic, economic model
validation to inspect the quality of the information decompression. Lastly, our
methodology is successfully tested for a realistic data set of German term life
insurance contracts.
【56】 Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO
标题:使用空竹检测人与人或物(H2O)的交互
作者:Astrid Orcesi,Romaric Audigier,Fritz Poka Toukam,Bertrand Luvison
备注:ACCEPTED in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)
摘要:Detecting human interactions is crucial for human behavior analysis. Many
methods have been proposed to deal with Human-to-Object Interaction (HOI)
detection, i.e., detecting in an image which person and object interact
together and classifying the type of interaction. However, Human-to-Human
Interactions, such as social and violent interactions, are generally not
considered in available HOI training datasets. As we think these types of
interactions cannot be ignored and decorrelated from HOI when analyzing human
behavior, we propose a new interaction dataset to deal with both types of human
interactions: Human-to-Human-or-Object (H2O). In addition, we introduce a novel
taxonomy of verbs, intended to be closer to a description of human body
attitude in relation to the surrounding targets of interaction, and more
independent of the environment. Unlike some existing datasets, we strive to
avoid defining synonymous verbs when their use highly depends on the target
type or requires a high level of semantic interpretation. As the H2O dataset
includes V-COCO images annotated with this new taxonomy, images obviously
contain more interactions. This can be an issue for HOI detection methods whose
complexity depends on the number of people, targets or interactions. Thus, we
propose DIABOLO (Detecting InterActions By Only Looking Once), an efficient
subject-centric single-shot method to detect all interactions in one forward
pass, with constant inference time independent of image content. In addition,
this multi-task network simultaneously detects all people and objects. We show
how sharing a network for these tasks not only saves computational resources
but also improves performance collaboratively. Finally, DIABOLO is a strong
baseline for the new proposed challenge of H2O Interaction detection, as it
outperforms all state-of-the-art methods when trained and evaluated on the HOI
dataset V-COCO.
【57】 InRS: implementing the indicator function of NURBS-shaped planar domains
作者:Alvise Sommariva,Marco Vianello
摘要:We provide an algorithm that implements the indicator function of
NURBS-shaped planar domains, tailored to the fast computation on huge point
clouds, together with the corresponding Matlab code.
【58】 Unwinding Rotations Improves User Comfort with Immersive Telepresence Robots
标题:使用身临其境的网真机器人,展开旋转可提高用户舒适度
作者:Markku Suomalainen,Basak Sakcak,Adhi Widagdo,Juho Kalliokoski,Katherine J. Mimnaugh,Alexis P. Chambers,Timo Ojala,Steven M. LaValle
备注:Accepted for publication in HRI (Int. Conf. on Human-Robot Interaction) 2022
摘要:We propose unwinding the rotations experienced by the user of an immersive
telepresence robot to improve comfort and reduce VR sickness of the user. By
immersive telepresence we refer to a situation where a 360° camera on
top of a mobile robot is streaming video and audio into a head-mounted display
worn by a remote user possibly far away. Thus, it enables the user to be
present at the robot's location, look around by turning the head and
communicate with people near the robot. By unwinding the rotations of the
camera frame, the user's viewpoint is not changed when the robot rotates. The
user can change her viewpoint only by physically rotating in her local setting;
as visual rotation without the corresponding vestibular stimulation is a major
source of VR sickness, physical rotation by the user is expected to reduce VR
sickness. We implemented unwinding the rotations for a simulated robot
traversing a virtual environment and ran a user study (N=34) comparing
unwinding rotations to the user's viewpoint turning when the robot turns. Our
results show that the users found unwound rotations preferable and more
comfortable and that it reduced their level of VR sickness. We also present
further results about the users' path integration capabilities, viewing
directions, and subjective observations of the robot's speed and distances to
simulated people and objects.
【59】 Methods for Increasing the Resistance of Cryptographic Designs against Horizontal DPA Attacks
标题:提高密码设计抵抗水平DPA攻击的方法
作者:Ievgen Kabin,Zoya Dyka,Dan Kreiser,Peter Langendoerfer
备注:Author's version accepted for ICICS-2017; the final publication is available at Springer
摘要:Side-channel analysis attacks, especially horizontal DPA and DEMA attacks,
are significant threats for cryptographic designs. In this paper we investigate
to what extent different multiplication formulae and randomization of the
field multiplier increase the resistance of an ECC design against horizontal
attacks. We implemented a randomized sequence of the calculation of partial
products for the field multiplication in order to increase the security
features of the field multiplier. Additionally, we use the partial polynomial
multiplier itself as a kind of countermeasure against DPA attacks. We
demonstrate that the implemented classical multiplication formula can increase
the inherent resistance of the whole ECC design. We also investigate the impact
of the combination of these two approaches. For the evaluation we synthesized
all these designs for a 250 nm gate library technology, and analysed the
simulated power traces. All investigated protection means help to decrease the
success rate of attacks significantly: the correctness of the revealed key was
decreased from 99% to 69%.
【60】 The Defeat of the Winograd Schema Challenge
标题:Winograd Schema挑战赛的失败
作者:Vid Kocijan,Ernest Davis,Thomas Lukasiewicz,Gary Marcus,Leora Morgenstern
摘要:The Winograd Schema Challenge -- a set of twin sentences involving pronoun
reference disambiguation that seem to require the use of commonsense knowledge
-- was proposed by Hector Levesque in 2011. By 2019, a number of AI systems,
based on large pre-trained transformer-based language models and fine-tuned on
these kinds of problems, achieved better than 90% accuracy. In this paper, we
review the history of the Winograd Schema Challenge and assess its
significance.
【61】 Offline Reinforcement Learning for Road Traffic Control
标题:用于道路交通控制的离线强化学习
作者:Mayuresh Kunjir,Sanjay Chawla
备注:8 pages
摘要:Traffic signal control is an important problem in urban mobility with a
significant potential of economic and environmental impact. While there is a
growing interest in Reinforcement Learning (RL) for traffic control, the work
so far has focussed on learning through interactions which, in practice, is
costly. Instead, real experience data on traffic is available and could be
exploited at minimal costs. Recent progress in offline or batch RL has enabled
just that. Model-based offline RL methods, in particular, have been shown to
generalize to the experience data much better than others. We build a
model-based learning framework, A-DAC, which infers a Markov Decision Process
(MDP) from a dataset with pessimistic costs built in to deal with data
uncertainties. The costs are modeled through an adaptive shaping of rewards in
the MDP which provides better regularization of data compared to the prior
related work. A-DAC is evaluated on a complex signalized roundabout using
multiple datasets varying in size and in batch collection policy. The
evaluation results show that it is possible to build high performance control
policies in a data efficient manner using simplistic batch collection policies.
【62】 As-Continuous-As-Possible Ceramics Printing for Shell Models
标题:贝壳模型的尽可能连续的陶瓷印刷
作者:Fanchao Zhong,Yonglai Xu,Haisen Zhao,Lin Lu
备注:15 pages, 21 figures
摘要:We propose a novel computational framework for fabricating thin shell models
on an extrusion-based Cartesian 3D printer using clay material.
Extrusion-based ceramics printing involves several inevitable challenges to
achieve acceptable print quality, including continuous toolpath with the
minimal number of transfer moves, separation of non-model and model structures,
etc. Inertia of the extruded material may damage the surface quality during
transfer moves. The viscosity also makes support material hard to remove. These
challenges even increase for thin shell surfaces, as both sides are of visual
significance, making it impossible to hide any intermediate structures in the
interiors. To conquer these challenges, we adopt a curved layer scheme for
ceramics printing. Then we introduce an original criterion "one-path patch"
(OPP), for representing a shell surface patch that can be traversed in one path
in the context of curved layer printing considering fabrication constraints. We
propose a bottom-up OPP merging procedure for decomposing the given shell
surface into a minimal number of OPPs and generating the
"as-continuous-as-possible" (ACAP) toolpath. Furthermore, we customize the path
planning algorithm with a decoupled orientation and support structures
computation method. Results demonstrate that our ACAP algorithm prints shell
models with both efficiency and surface quality.
【63】 Mirror Learning: A Unifying Framework of Policy Optimisation
标题:镜像学习:政策优化的统一框架
作者:Jakub Grudzien Kuba,Christian Schroeder de Witt,Jakob Foerster
摘要:General policy improvement (GPI) and trust-region learning (TRL) are the
predominant frameworks within contemporary reinforcement learning (RL), which
serve as the core models for solving Markov decision processes (MDPs).
Unfortunately, in their mathematical form, they are sensitive to modifications,
and thus, the practical instantiations that implement them do not automatically
inherit their improvement guarantees. As a result, the spectrum of available
rigorous MDP-solvers is narrow. Indeed, many state-of-the-art (SOTA)
algorithms, such as TRPO and PPO, are not proven to converge. In this paper, we
propose mirror learning -- a general solution to the RL problem. We
reveal GPI and TRL to be but small points within this far greater space of
algorithms which boasts the monotonic improvement property and converges to the
optimal policy. We show that virtually all SOTA algorithms for RL are instances
of mirror learning, and thus suggest that their empirical performance is a
consequence of their theoretical properties, rather than of approximate
analogies. Excitingly, we show that mirror learning opens up a whole new space
of policy learning methods with convergence guarantees.
【64】 Deep Generative Framework for Interactive 3D Terrain Authoring and Manipulation
标题:交互式三维地形创作和操纵的深度生成框架
作者:Shanthika Naik,Aryamaan Jain,Avinash Sharma,KS Rajan
摘要:Automated generation and (user) authoring of realistic virtual terrain is
much sought after by multimedia applications such as VR models and gaming. The
most common representation adopted for terrain is Digital Elevation Model
(DEM). Existing terrain authoring and modeling techniques have addressed some
of these and can be broadly categorized as: procedural modeling, simulation
method, and example-based methods. In this paper, we propose a novel realistic
terrain authoring framework powered by a combination of VAE and generative
conditional GAN model. Our framework is an example-based method that attempts
to overcome the limitations of existing methods by learning a latent space from
a real-world terrain dataset. This latent space allows us to generate multiple
variants of terrain from a single input as well as interpolate between terrains
while keeping the generated terrains close to real-world data distribution. We
also developed an interactive tool that lets the user generate diverse
terrains with minimalist inputs. We perform thorough qualitative and
quantitative analysis and provide comparisons with other SOTA methods. We
intend to release our code/tool to the academic community.
【65】 Uncertainty-Aware Cascaded Dilation Filtering for High-Efficiency Deraining
标题:基于不确定性感知的级联膨胀滤波高效去噪
作者:Qing Guo,Jingyang Sun,Felix Juefei-Xu,Lei Ma,Di Lin,Wei Feng,Song Wang
备注:14 pages, 10 figures, 10 tables. This is the extension of our conference version
摘要:Deraining is a significant and fundamental computer vision task, aiming to
remove the rain streaks and accumulations in an image or video captured on a
rainy day. Existing deraining methods usually make heuristic assumptions of the
rain model, which compels them to employ complex optimization or iterative
refinement for high recovery quality. This, however, leads to time-consuming
methods and affects the effectiveness for addressing rain patterns that deviate
from the assumptions. In this paper, we propose a simple yet efficient
deraining method by formulating deraining as a predictive filtering problem
without complex rain model assumptions. Specifically, we identify
spatially-variant predictive filtering (SPFilt) that adaptively predicts proper
kernels via a deep network to filter different individual pixels. Since the
filtering can be implemented via well-accelerated convolution, our method can
be significantly efficient. We further propose the EfDeRain+ that contains
three main contributions to address residual rain traces, multi-scale, and
diverse rain patterns without harming the efficiency. First, we propose the
uncertainty-aware cascaded predictive filtering (UC-PFilt) that can identify
the difficulties of reconstructing clean pixels via predicted kernels and
remove the residual rain traces effectively. Second, we design the
weight-sharing multi-scale dilated filtering (WS-MS-DFilt) to handle
multi-scale rain streaks without harming the efficiency. Third, to eliminate
the gap across diverse rain patterns, we propose a novel data augmentation
method (i.e., RainMix) to train our deep models. By combining all contributions
with sophisticated analysis on different variants, our final method outperforms
baseline methods on four single-image deraining datasets and one video
deraining dataset in terms of both recovery quality and speed.
【66】 Motion Prediction via Joint Dependency Modeling in Phase Space
标题:基于相空间联合依赖建模的运动预测
作者:Pengxiang Su,Zhenguang Liu,Shuang Wu,Lei Zhu,Yifang Yin,Xuanjing Shen
摘要:Motion prediction is a classic problem in computer vision, which aims at
forecasting future motion given the observed pose sequence. Various deep
learning models have been proposed, achieving state-of-the-art performance on
motion prediction. However, existing methods typically focus on modeling
temporal dynamics in the pose space. Unfortunately, the complicated and
high-dimensional nature of human motion brings inherent challenges for dynamic
context capturing. Therefore, we move away from the conventional pose-based
representation and present a novel approach employing a phase space trajectory
representation of individual joints. Moreover, current methods tend to only
consider the dependencies between physically connected joints. In this paper,
we introduce a novel convolutional neural model to effectively leverage
explicit prior knowledge of motion anatomy, and simultaneously capture both
spatial and temporal information of joint trajectory dynamics. We then propose
a global optimization module that learns the implicit relationships between
individual joint features.
Empirically, our method is evaluated on large-scale 3D human motion benchmark
datasets (i.e., Human3.6M, CMU MoCap). These results demonstrate that our
method sets the new state-of-the-art on the benchmark datasets. Our code will
be available at https://github.com/Pose-Group/TEID.
【67】 Towards Trustworthy DeFi Oracles: Past, Present and Future
标题:走向值得信赖的德菲甲骨文:过去、现在和未来
作者:Yinjie Zhao,Xin Kang,Tieyan Li,Cheng-Kang Chu,Haiguang Wang
备注:Under review
摘要:With the rapid development of blockchain technology in recent years, all
kinds of blockchain-based applications have emerged. Among them, the
decentralized finance (DeFi) is one of the most successful applications, which
is regarded as the future of finance. The great success of DeFi relies on the
real-world data which is not directly available on the blockchain. Besides, due
to the deterministic nature of blockchain, the blockchain cannot directly obtain
non-deterministic data from the outside world (off-chain). Thus, oracles have
appeared as a viable solution to feed off-chain data to blockchain
applications. In this paper, we carry out a comprehensive study on oracles,
especially on DeFi oracles. We first briefly introduce the application
scenarios of DeFi oracles, and then we talk about the past of DeFi oracles by
categorizing them into several types based on their design features. After
that, we introduce five popular DeFi oracles currently in use (such as Chainlink
and Band Protocol), with the focus on their system architecture, data
validation process, and their incentive mechanisms. We compare these present
DeFi oracles in terms of their data trustworthiness, data source trustworthiness and
their overall trust models. Finally, we propose a set of metrics for designing
trustworthy DeFi oracles, and propose a potential trust architecture and a
few promising techniques for building trustworthy oracles.
【68】 GenLabel: Mixup Relabeling using Generative Models
标题:GenLabel:使用产生式模型的混合重标记
作者:Jy-yong Sohn,Liang Shang,Hongxu Chen,Jaekyun Moon,Dimitris Papailiopoulos,Kangwook Lee
摘要:Mixup is a data augmentation method that generates new data points by mixing
a pair of input data. While mixup generally improves the prediction
performance, it sometimes degrades the performance. In this paper, we first
identify the main causes of this phenomenon by theoretically and empirically
analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple
yet effective relabeling algorithm designed for mixup. In particular, GenLabel
helps the mixup algorithm correctly label mixup samples by learning the
class-conditional data distribution using generative models. Via extensive
theoretical and empirical analysis, we show that mixup, when used together with
GenLabel, can effectively resolve the aforementioned phenomenon, improving the
generalization performance and the adversarial robustness.
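The standard mixup step that this abstract builds on can be sketched as follows: a mixed sample is a convex combination of two inputs, and its label is the same combination of the two one-hot labels, with the mixing weight usually drawn from a Beta(alpha, alpha) distribution. This shows plain mixup only; the GenLabel relabeling step is not reproduced here, and the function name and the `lam` override parameter are assumptions for illustration.

```python
import random

def mixup_pair(x1, y1, x2, y2, alpha=1.0, lam=None):
    """Mix two (input, one-hot label) pairs with weight lam ~ Beta(alpha, alpha)."""
    if lam is None:
        lam = random.betavariate(alpha, alpha)
    # The input and the label are mixed with the same weight.
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Hypothetical two-dimensional inputs from two different classes.
x_mix, y_mix, lam_used = mixup_pair([1.0, 0.0], [1.0, 0.0],
                                    [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

Because the label is tied linearly to the mixing weight, a mixed point that lands close to real data of a third class would still receive a label from the two mixed classes only, which is one kind of situation where relabeling could matter.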
【69】 Degrees of Freedom Analysis of Mechanisms using the New Zebra Crossing Method
标题:用新斑马线交叉法进行机构自由度分析
作者:Rajashekhar V S,Debasish Ghose
备注:31 pages and 17 figures
摘要:Mobility, which is a basic property of a mechanism, has to be analyzed to
find the degrees of freedom. A quick method for calculation of degrees of
freedom in a mechanism is proposed in this work. The mechanism is represented
in a way that resembles a zebra crossing. An algorithm is proposed which is
used to determine the mobility from the zebra crossing diagram. This algorithm
takes into account the number of patches between the black patches, the number
of joints attached to the fixed link and the number of loops in the mechanism.
A number of cases have been discussed which fail to give the desired results
using the widely used classical Kutzbach-Grubler formula.
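For context, the classical Kutzbach-Grubler criterion mentioned above can be written for planar mechanisms as M = 3(n - 1) - 2*j1 - j2, where n counts the links including the fixed one, j1 the one-degree-of-freedom joints, and j2 the two-degree-of-freedom joints. The sketch below implements only this classical formula, not the zebra crossing method proposed in the paper; the function and parameter names are assumptions.

```python
def planar_mobility(links, one_dof_joints, two_dof_joints=0):
    """Classical planar Kutzbach-Grubler count: M = 3(n - 1) - 2*j1 - j2."""
    return 3 * (links - 1) - 2 * one_dof_joints - two_dof_joints

# Four-bar linkage: 4 links (including ground) and 4 revolute joints.
four_bar = planar_mobility(links=4, one_dof_joints=4)

# Five links and 6 revolute joints: the formula reports a rigid structure.
five_link = planar_mobility(links=5, one_dof_joints=6)
```

The formula counts constraints without looking at geometry, so it gives 0 for any five-link, six-joint arrangement, even special ones such as a parallelogram linkage with a redundant coupler that can in fact move. Cases of this kind are presumably what the abstract refers to as failures of the classical formula.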
【70】 Asymptotic Security using Bayesian Defense Mechanisms with Application to Cyber Deception
标题:基于贝叶斯防御机制的渐近安全性及其在网络欺骗中的应用
作者:Hampei Sasahara,Henrik Sandberg
备注:16 pages
摘要:This study addresses the question of whether model knowledge can prevent a
defender from being deceived in cyber security. As a specific
model-based defense scheme, this study treats the Bayesian defense mechanism, which
monitors the system's behavior, forms a belief on the existence of the attacker,
and chooses appropriate reactions. A sophisticated attacker aims at achieving her
objective while avoiding being detected by deceiving the defender. In this
paper, their dynamic decision making is formulated as a stochastic signaling
game. It is revealed that the belief on the true scenario has a limit in a
stochastic sense at an equilibrium based on martingale analysis. This fact
implies that there are only two possible cases: the defender asymptotically
detects the attack with a firm belief or the attacker takes actions such that
the system's behavior becomes nominal after a certain finite time step.
Consequently, if the dynamics admits no stealthy attacks, the system is
guaranteed to be secure in an asymptotic manner provided that effective
countermeasures are implemented. The result concludes that model knowledge can
prevent deception in an asymptotic sense. As an application of the finding, a
defensive deception utilizing asymmetric recognition on vulnerabilities
exploited by the attacker is analyzed. It is shown that the attacker may
stop the attack even if the defender is unaware of the vulnerabilities, as long
as the defender's unawareness is concealed by the defensive deception. Those
results indicate the powerful defense capability achieved by model knowledge.
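The belief-forming step of a Bayesian defender can be illustrated with a plain Bayes-rule update. This is a generic sketch, not the paper's signaling-game model: the Gaussian observation model, the means, and the names `gaussian_pdf` and `update_belief` are all assumptions made for illustration.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def update_belief(belief, y, mean_attack, mean_nominal, std=1.0):
    """Posterior probability that an attacker is present after observing y."""
    like_attack = gaussian_pdf(y, mean_attack, std)
    like_nominal = gaussian_pdf(y, mean_nominal, std)
    evidence = belief * like_attack + (1.0 - belief) * like_nominal
    return belief * like_attack / evidence

# Observations that look more like the (assumed) attacked system than the nominal one.
belief = 0.5
for y in [1.2, 0.9, 1.1, 1.0]:
    belief = update_belief(belief, y, mean_attack=1.0, mean_nominal=0.0)
```

If the observations are equally likely under both hypotheses, the belief does not move, which matches the intuition that an attacker who keeps the system's behavior nominal cannot be detected from that behavior alone.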
【71】 The Study of Peer Assessment Impact on Group Learning Activities
标题:同伴评价对小组学习活动的影响研究
作者:Zhiyuan Chen,Soon Boon Lee,Shazia Paras Shaikh,Mirza Rayana Sanzana
备注:Regular Research Paper Accepted by FECS'21 (The 17th Int'l Conf on Frontiers in Education: Computer Science and Computer Engineering)
摘要:Compared with lecturer-marked assessment, peer assessment is a more
comprehensive learning process, and many associated problems have
arisen. In this research work, we study the impact of peer assessment on group
learning activities in order to provide a complete and systematic review and to
improve the practice and quality of the peer assessment process. Pilot studies
were conducted and took the form of surveys, focus group interviews, and
questionnaires. Preliminary surveys were conducted with 582 students and 276
responses were received, giving a response rate of 47.4%. The results show that 37%
of students would choose individual work over group work if given the choice. In the
case study, 82.1% of the 28 students enjoyed working in a group
using Facebook as a communication tool. 89.3% of the students could demonstrate
their skills through group work and, most importantly, 82.1% of them agreed
that peer assessment is an impartial method of assessment with the help of
Facebook as proof of self-contribution. Our suggestions for making group work a
pleasant experience are identifying and taking action against
freeloaders, giving credit to deserving students, educating students on how
to give constructive feedback, and making the assessment process transparent to
all.
【72】 SaL-Lightning Dataset: Search and Eye Gaze Behavior, Resource Interactions and Knowledge Gain during Web Search
标题:SAL-Lightning数据集:网络搜索期间的搜索和眼睛注视行为、资源交互和知识获取
作者:Christian Otto,Markus Rokicki,Georg Pardi,Wolfgang Gritz,Daniel Hienert,Ran Yu,Johannes von Hoyer,Anett Hoppe,Stefan Dietze,Peter Holtz,Yvonne Kammerer,Ralph Ewerth
备注:To be published at the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR '22)
摘要:The emerging research field Search as Learning investigates how the Web
facilitates learning through modern information retrieval systems. SAL research
requires significant amounts of data that capture both search behavior of users
and their acquired knowledge in order to obtain conclusive insights or train
supervised machine learning models. However, the creation of such datasets is
costly and requires interdisciplinary efforts in order to design studies and
capture a wide range of features. In this paper, we address this issue and
introduce an extensive dataset based on a user study, in which 114
participants were asked to learn about the formation of lightning and thunder.
Participants' knowledge states were measured before and after Web search
through multiple-choice questionnaires and essay-based free recall tasks. To
enable future research in SAL-related tasks we recorded a plethora of features
and person-related attributes. Besides the screen recordings, visited Web
pages, and detailed browsing histories, a large number of behavioral features
and resource features were monitored. We underline the usefulness of the
dataset by describing three already published use cases.
【73】 Decision problem of some bundled FOML fragments
标题:若干捆绑FOML片段的判定问题
作者:Mo Liu
摘要:Over increasing domain interpretations, the $\exists\Box$ and $\forall\Box$ bundled
fragments are decidable; over constant domain interpretations, the $\exists\Box$
bundled fragment is decidable while the $\forall\Box$ bundled fragment is
undecidable. Based on these existing results, we show that over increasing domain
interpretations, the $\Box\exists$ and $\Box\forall$ bundled fragments are decidable as
well. On the other hand, over constant domain interpretations, the $\Box\forall$
bundled fragment is undecidable, and the $\Box\exists^2$ bundled fragment, an
extension of the $\Box\exists$ bundled fragment, is undecidable as well.
【74】 iDECODe: In-distribution Equivariance for Conformal Out-of-distribution Detection
标题:IDECODe:用于共形分布外检测的分布内等差
作者:Ramneet Kaur,Susmit Jha,Anirban Roy,Sangdon Park,Edgar Dobriban,Oleg Sokolsky,Insup Lee
备注:Association for the Advancement of Artificial Intelligence (AAAI), 2022
摘要:Machine learning methods such as deep neural networks (DNNs), despite their
success across different domains, are known to often generate incorrect
predictions with high confidence on inputs outside their training distribution.
The deployment of DNNs in safety-critical domains requires detection of
out-of-distribution (OOD) data so that DNNs can abstain from making predictions
on those. A number of methods have been recently developed for OOD detection,
but there is still room for improvement. We propose the new method iDECODe,
leveraging in-distribution equivariance for conformal OOD detection. It relies
on a novel base non-conformity measure and a new aggregation method, used in
the inductive conformal anomaly detection framework, thereby guaranteeing a
bounded false detection rate. We demonstrate the efficacy of iDECODe by
experiments on image and audio datasets, obtaining state-of-the-art results. We
also show that iDECODe can detect adversarial examples.
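The bounded false detection rate comes from the conformal p-value test used in inductive conformal anomaly detection. The sketch below shows that generic test only; iDECODe's own non-conformity measure and aggregation method are not reproduced, and the scores, the threshold `epsilon`, and the function names are hypothetical.

```python
import numpy as np

def conformal_p_value(calibration_scores, test_score):
    """Fraction of scores (calibration plus the test point) at least as
    non-conforming as the test point."""
    calibration_scores = np.asarray(calibration_scores, dtype=float)
    n_at_least = int(np.sum(calibration_scores >= test_score))
    return (n_at_least + 1) / (len(calibration_scores) + 1)

def is_ood(calibration_scores, test_score, epsilon):
    """Flag the input as out-of-distribution when its p-value falls below epsilon."""
    return conformal_p_value(calibration_scores, test_score) < epsilon

# Made-up non-conformity scores from held-out in-distribution data.
calibration = [0.1, 0.2, 0.3, 0.4]
p_typical = conformal_p_value(calibration, 0.35)
p_extreme = conformal_p_value(calibration, 10.0)
flagged = is_ood(calibration, 10.0, epsilon=0.25)
```

When the calibration and test scores are exchangeable, an in-distribution input is flagged with probability at most `epsilon`, which is the sense in which the false detection rate is bounded.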
【75】 On the Effectiveness of Sampled Softmax Loss for Item Recommendation
标题:抽样软最大损失在项目推荐中的有效性研究
作者:Jiancan Wu,Xiang Wang,Xingyu Gao,Jiawei Chen,Hongcheng Fu,Tianyu Qiu,Xiangnan He
备注:10 Pages, 1 figure, 5 tables
摘要:Learning objectives of recommender models remain largely unexplored. Most
methods routinely adopt either pointwise or pairwise loss to train the model
parameters, while rarely paying attention to softmax loss due to the high
computational cost. Sampled softmax loss emerges as an efficient substitute for
softmax loss. Its special case, InfoNCE loss, has been widely used in
self-supervised learning and exhibited remarkable performance for contrastive
learning. Nonetheless, limited studies use sampled softmax loss as the learning
objective to train the recommender. Worse still, none of them explore its
properties or answer "Does sampled softmax loss suit item recommendation?"
and "What are the conceptual advantages of sampled softmax loss, as compared
with the prevalent losses?", to the best of our knowledge. In this work, we aim
to better understand sampled softmax loss for item recommendation.
Specifically, we first theoretically reveal three model-agnostic advantages:
(1) mitigating popularity bias, which is beneficial to long-tail
recommendation; (2) mining hard negative samples, which offers informative
gradients to optimize model parameters; and (3) maximizing the ranking metric,
which facilitates top-K performance. Moreover, we probe the model-specific
characteristics on the top of various recommenders. Experimental results
suggest that sampled softmax loss is better suited to history- and graph-based
recommenders (e.g., SVD++ and LightGCN), but performs poorly for ID-based
models (e.g., MF). We ascribe this to its shortcoming in learning
representation magnitude: when combined with models that are also
incapable of adjusting representation magnitude, it learns poor representations. In
contrast, the history- and graph-based models, which naturally adjust
representation magnitude according to node degree, are able to compensate for
the shortcoming of sampled softmax loss.
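A minimal sketch of the loss under discussion, for one user with one positive item and a handful of sampled negatives. It uses the InfoNCE-style form with a temperature and omits any sampling-probability correction; the function name, the scores, and the default temperature are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sampled_softmax_loss(pos_score, neg_scores, temperature=1.0):
    """Negative log-probability of the positive item among the sampled candidates."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    logits = np.concatenate(([pos_score], neg_scores)) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    log_prob_pos = logits[0] - np.log(np.exp(logits).sum())
    return -log_prob_pos

# Made-up user-item scores: one positive and three sampled negatives.
loss_easy = sampled_softmax_loss(5.0, [0.0, 0.0, 0.0])
loss_hard = sampled_softmax_loss(5.0, [4.5, 4.8, 0.0])
```

Negatives whose scores are close to the positive raise the loss far more than easy ones, which gives a rough intuition for the hard-negative-mining property listed in the abstract.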
【76】 Distributed Nash Equilibrium Seeking over Time-Varying Directed Communication Networks
标题:时变有向通信网络上的分布式纳什均衡求解
作者:Duong Thuy Anh Nguyen,Duong Tung Nguyen,Angelia Nedić
摘要:We study distributed algorithms for finding a Nash equilibrium (NE) in a
class of non-cooperative convex games under partial information. Specifically,
each agent has access only to its own smooth local cost function and can
receive information from its neighbors in a time-varying directed communication
network. To this end, we propose a distributed gradient play algorithm to
compute a NE by utilizing local information exchange among the players. In this
algorithm, every agent performs a gradient step to minimize its own cost
function while sharing and retrieving information locally among its neighbors.
The existing methods impose strong assumptions such as balancedness of the
mixing matrices and global knowledge of the network communication structure,
including Perron-Frobenius eigenvector of the adjacency matrix and other graph
connectivity constants. In contrast, our approach relies only on a reasonable
and widely-used assumption of row-stochasticity of the mixing matrices. We
analyze the algorithm for time-varying directed graphs and prove its
convergence to the NE, when the agents' cost functions are strongly convex and
have Lipschitz continuous gradients. Numerical simulations are performed for a
Nash-Cournot game to illustrate the efficacy of the proposed algorithm.
【77】 An Unsupervised Masking Objective for Abstractive Multi-Document News Summarization
标题:一种面向抽象多文档新闻摘要的无监督掩蔽目标
作者:Nikolai Vogler,Songlin Li,Yujie Xu,Yujian Mi,Taylor Berg-Kirkpatrick
摘要:We show that a simple unsupervised masking objective can approach near
supervised performance on abstractive multi-document news summarization. Our
method trains a state-of-the-art neural summarization model to predict the
masked out source document with highest lexical centrality relative to the
multi-document group. In experiments on the Multi-News dataset, our masked
training objective yields a system that outperforms past unsupervised methods
and, in human evaluation, surpasses the best supervised method without
requiring access to any ground-truth summaries. Further, we evaluate how
different measures of lexical centrality, inspired by past work on extractive
summarization, affect final performance.
【78】 A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation
标题:教育资源发现的迁移学习流水线及其在前导段落生成中的应用
作者:Irene Li,Thomas George,Alexander Fabbri,Tammy Liao,Benjamin Chen,Rina Kawamura,Richard Zhou,Vanessa Yan,Swapnil Hingmire,Dragomir Radev
摘要:Effective human learning depends on a wide selection of educational materials
that align with the learner's current understanding of the topic. While the
Internet has revolutionized human learning or education, a substantial resource
accessibility barrier still exists. Namely, the excess of online information
can make it challenging to navigate and discover high-quality learning
materials. In this paper, we propose the educational resource discovery (ERD)
pipeline that automates web resource discovery for novel domains. The pipeline
consists of three main steps: data collection, feature extraction, and resource
classification. We start with a known source domain and conduct resource
discovery on two unseen target domains via transfer learning. We first collect
frequent queries from a set of seed documents and search on the web to obtain
candidate resources, such as lecture slides and introductory blog posts. Then
we introduce a novel pretrained information retrieval deep neural network
model, query-document masked language modeling (QD-MLM), to extract deep
features of these candidate resources. We apply a tree-based classifier to
decide whether the candidate is a positive learning resource. The pipeline
achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel
target domains. Finally, we demonstrate how this pipeline can benefit an
application: leading paragraph generation for surveys. This is the first study
that considers various web resources for survey generation, to the best of our
knowledge. We also release a corpus of 39,728 manually labeled web resources
and 659 queries from NLP, Computer Vision (CV), and Statistics (STATS).
【79】 Multi-Behavior Enhanced Recommendation with Cross-Interaction Collaborative Relation Modeling
标题:基于交叉交互协同关系建模的多行为增强推荐
作者:Lianghao Xia,Chao Huang,Yong Xu,Peng Dai,Mengyin Lu,Liefeng Bo
备注:Published at ICDE 2021
摘要:Many previous studies aim to augment collaborative filtering with deep neural
network techniques, so as to achieve better recommendation performance.
However, most existing deep learning-based recommender systems are designed for
modeling a single type of user-item interaction behavior, and can hardly
distill the heterogeneous relations between users and items. In practical
recommendation scenarios, there exist multiple types of user behavior, such as browsing
and purchasing. Because they overlook users' multi-behavior patterns over
different items, existing recommendation methods are insufficient to capture
heterogeneous collaborative signals from user multi-behavior data. Inspired by
the strength of graph neural networks for structured data modeling, this work
proposes a Graph Neural Multi-Behavior Enhanced Recommendation (GNMR) framework
which explicitly models the dependencies between different types of user-item
interactions under a graph-based message passing architecture. GNMR devises a
relation aggregation network to model interaction heterogeneity, and
recursively performs embedding propagation between neighboring nodes over the
user-item interaction graph. Experiments on real-world recommendation datasets
show that our GNMR consistently outperforms state-of-the-art methods. The
source code is available at https://github.com/akaxlh/GNMR.
【80】 Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task
标题:利用辅助大任务学习标签不一致的多任务
作者:Quan Feng,Songcan Chen
摘要:Multi-task learning (MTL) aims to improve the performance of a model by
transferring and exploiting common knowledge among tasks. Existing MTL works
mainly focus on the scenario where the label sets of multiple tasks (MTs) are
usually the same, so they can be utilized for learning across the tasks.
However, few works explore the scenario where each task has only a small
number of training samples and the label sets overlap only partially
or not at all. Learning such MTs is more challenging because less correlation
information is available among these tasks. To address this, we propose a framework to
learn these tasks by jointly leveraging both abundant information from a learnt
auxiliary big task with sufficiently many classes to cover those of all these
tasks and the information shared among those partially-overlapped tasks. In our
implementation of using the same neural network architecture of the learnt
auxiliary task to learn individual tasks, the key idea is to utilize available
label information to adaptively prune the hidden layer neurons of the auxiliary
network to construct corresponding network for each task, while accompanying a
joint learning across individual tasks. Our experimental results demonstrate
its effectiveness in comparison with the state-of-the-art approaches.
【81】 Budget-aware Few-shot Learning via Graph Convolutional Network
标题:基于图卷积网络的预算感知小概率学习
作者:Shipeng Yan,Songyang Zhang,Xuming He
摘要:This paper tackles the problem of few-shot learning, which aims to learn new
visual concepts from a few examples. A common problem setting in few-shot
classification assumes random sampling strategy in acquiring data labels, which
is inefficient in practical applications. In this work, we introduce a new
budget-aware few-shot learning problem that not only aims to learn novel object
categories, but also needs to select informative examples to annotate in order
to achieve data efficiency.
We develop a meta-learning strategy for our budget-aware few-shot learning
task, which jointly learns a novel data selection policy based on a Graph
Convolutional Network (GCN) and an example-based few-shot classifier. Our
selection policy computes a context-sensitive representation for each unlabeled
data by graph message passing, which is then used to predict an informativeness
score for sequential selection. We validate our method by extensive experiments
on the mini-ImageNet, tiered-ImageNet and Omniglot datasets. The results show
our few-shot learning strategy outperforms baselines by a sizable margin, which
demonstrates the efficacy of our method.
【82】 From Textual Experiments to Experimental Texts: Expressive Repetition in "Artificial Intelligence Literature"
作者:Tianhua Zhu
备注:12 pages; to appear on SASS Studies, 2021 Winter. This is an English version; please consider citing the original paper in Chinese
摘要:Since the birth of artificial intelligence 70 years ago, attempts at literary
"creation" with computers are present in the course of technological
development, creating what one might call "artificial intelligence literature"
(AI literature). Evolving from "textual experiments" conducted by technologists
to "experimental texts" that explore the possibilities of conceptions of
literature, AI literature integrates primitive problems including machine
thinking, text generation, and machine creativity, which exhibits the two-way
interaction between social ideas and technology. In the early stage, the mutual
support between technological path and artistic ideas turned out to be a
failure, while AI-driven expressive repetitions are made probable in the
contemporary technological context, paving the way for the transformation of AI
literature from proof for technical possibilities to self-verification of
literary value.
【83】 Extending One-Stage Detection with Open-World Proposals
标题:利用开放世界方案扩展一阶段检测
作者:Sachin Konan,Kevin J Liang,Li Yin
摘要:In many applications, such as autonomous driving, hand manipulation, or robot
navigation, object detection methods must be able to detect objects unseen in
the training set. Open World Detection (OWD) seeks to tackle this problem by
generalizing detection performance to seen and unseen class categories. Recent
works have seen success in the generation of class-agnostic proposals, which we
call Open-World Proposals (OWP), but this comes at the cost of a big drop on the
classification task when both tasks are considered in the detection model.
These works have investigated two-stage Region Proposal Networks (RPN) by
taking advantage of objectness scoring cues; however, for its simplicity,
run-time, and decoupling of localization and classification, we investigate OWP
through the lens of fully convolutional one-stage detection network, such as
FCOS. We show that our architectural and sampling optimizations on FCOS can
increase OWP performance by as much as 6% in recall on novel classes, marking
the first proposal-free one-stage detection network to achieve comparable
performance to RPN-based two-stage networks. Furthermore, we show that the
inherent, decoupled architecture of FCOS has benefits to retaining
classification performance. While two-stage methods worsen by 6% in recall on
novel classes, we show that FCOS only drops 2% when jointly optimizing for OWP
and classification.
【84】 Time Series Forecasting Using Fuzzy Cognitive Maps: A Survey
标题:基于模糊认知图的时间序列预测研究综述
作者:Omid Orang,Petrônio Cândido de Lima e Silva,Frederico Guimarães Gadelha
摘要:Among various soft computing approaches for time series forecasting, Fuzzy
Cognitive Maps (FCM) have shown remarkable results as a tool to model and
analyze the dynamics of complex systems. FCM have similarities to recurrent
neural networks and can be classified as a neuro-fuzzy method. In other words,
FCMs are a mixture of fuzzy logic, neural network, and expert system aspects,
which act as a powerful tool for simulating and studying the dynamic behavior
of complex systems. The most interesting features are knowledge
interpretability, dynamic characteristics and learning capability. The goal of
this survey paper is mainly to present an overview on the most relevant and
recent FCM-based time series forecasting models proposed in the literature. In
addition, this article considers an introduction on the fundamentals of FCM
model and learning methodologies. Also, this survey provides some ideas for
future research to enhance the capabilities of FCM in order to cover some
challenges in the real-world experiments such as handling non-stationary data
and scalability issues. Moreover, equipping FCMs with fast learning algorithms
is one of the major concerns in this area.
【85】 Delay Alignment Modulation: Enabling Equalization-Free Single-Carrier Communication
标题:延迟对齐调制:实现无均衡单载波通信
作者:Haiquan Lu,Yong Zeng
备注:5 pages, 6 figures
摘要:This paper proposes a novel broadband transmission technology, termed delay
alignment modulation (DAM), which enables the low-complexity equalization-free
single-carrier communication, yet without suffering from inter-symbol
interference (ISI). The key idea of DAM is to deliberately introduce
appropriate delays for information-bearing symbols at the transmitter side, so
that after propagating over the time-dispersive channel, all multi-path signal
components will arrive at the receiver simultaneously and constructively. We
first show that by applying DAM for the basic multiple-input single-output
(MISO) communication system, an ISI-free additive white Gaussian noise (AWGN)
system can be obtained with the simple zero-forcing (ZF) beamforming.
Furthermore, the more general DAM scheme is studied with the ISI-maximal-ratio
transmission (MRT) and the ISI-minimum mean-square error (MMSE) beamforming.
Simulation results are provided to show that when the channel is sparse and/or
the antenna dimension is large, DAM not only resolves the notorious practical
issues suffered by orthogonal frequency-division multiplexing (OFDM) such as
high peak-to-average-power ratio (PAPR), severe out-of-band (OOB) emission, and
vulnerability to carrier frequency offset (CFO), with low complexity, but also
achieves higher spectral efficiency due to the saving of guard interval
overhead.
【86】 Voltage-Based State of Charge Correction at Charge-End
作者:Ali Abdollahi,Jianwei Li,Xiaojun Li,Trevor Jones,Asif Habeebullah
摘要:A voltage-based method is proposed to correct battery pack state of charge
(SOC) estimation at the charge-end. Two main characteristics make the
charge-end time span a good opportunity to correct SOC estimation: first, it is
easy to detect when the battery is at the last stage of charging because the
charging profile is known to the BMS designer and, during the charge-end
time span, the current is low and the terminal voltages of the battery
cells are high; second, as the battery reaches the charge-end stage, we know
that the true SOC is approaching 100%. This paper presents a method to
utilize these important features to correct the SOC estimation error. Using a
voltage threshold method, the algorithm detects when the battery is close to
the charge-end to activate the charge-end SOC correction strategy. Once
activated, the strategy corrects the SOC using the maximum cell voltage to
guarantee that SOC is 100% when charging is complete. The amount of correction
is a function of maximum cell voltage and the charge current C-rate.
【87】 Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping
标题:重新利用现有的深层网络进行字幕和美学引导的图像裁剪
作者:Nora Horanyi,Kedi Xia,Kwang Moo Yi,Abhishake Kumar Bojja,Ales Leonardis,Hyung Jin Chang
摘要:We propose a novel optimization framework that crops a given image based on
user description and aesthetics. Unlike existing image cropping methods, where
one typically trains a deep network to regress to crop parameters or cropping
actions, we propose to directly optimize for the cropping parameters by
repurposing pre-trained networks on image captioning and aesthetic tasks,
without any fine-tuning, thereby avoiding training a separate network.
Specifically, we search for the best crop parameters that minimize a combined
loss of the initial objectives of these networks. To make the optimization
stable, we propose three strategies: (i) multi-scale bilinear sampling, (ii)
annealing the scale of the crop region, thereby effectively reducing the
parameter space, and (iii) aggregation of multiple optimization results. Through
various quantitative and qualitative evaluations, we show that our framework
can produce crops that are well-aligned to intended user descriptions and
aesthetically pleasing.
【88】 De-rendering 3D Objects in the Wild
标题:在野外取消渲染3D对象
作者:Felix Wimbauer,Shangzhe Wu,Christian Rupprecht
摘要:With increasing focus on augmented and virtual reality applications (XR)
comes the demand for algorithms that can lift objects from images and videos
into representations that are suitable for a wide variety of related 3D tasks.
Large-scale deployment of XR devices and applications means that we cannot
solely rely on supervised learning, as collecting and annotating data for the
unlimited variety of objects in the real world is infeasible. We present a
weakly supervised method that is able to decompose a single image of an object
into shape (depth and normals), material (albedo, reflectivity and shininess)
and global lighting parameters. For training, the method only relies on a rough
initial shape estimate of the training objects to bootstrap the learning
process. This shape supervision can come for example from a pretrained depth
network or - more generically - from a traditional structure-from-motion
pipeline. In our experiments, we show that the method can successfully
de-render 2D images into a decomposed 3D representation and generalizes to
unseen object categories. Since in-the-wild evaluation is difficult due to the
lack of ground truth data, we also introduce a photo-realistic synthetic test
set that allows for quantitative evaluation.
【89】 Investigating Expectation Violations in Mobile Apps
标题:调查移动应用中的预期违规行为
作者:Sherlock A. Licorish,Helen E. Owen,Bastin Tony Roy Savarimuthu,Priyanka Patel
备注:32 pages, 4 figures, 8 tables
摘要:Information technology and software services are pervasive, occupying the
centre of most aspects of contemporary societies. This has given rise to
commonly expected norms and expectations around how such systems should work,
appropriate penalties for violating these expectations, and more importantly,
indicators of how to reduce the consequences of violations and sanctions.
Evidence for expectation violations and ensuing sanctions exists in a range of
portals used by individuals and groups to start new friendships, explore new
ideas, and provide feedback for products and services. Therein lie insights
that could lead to functional socio-technical systems, and general awareness
and anticipations of human actions (and interactions) when using information
technology and software services. However, limited previous work has examined
such artifacts to provide these understandings. To contribute to such
understandings and theoretical advancement we study expectation violations in
mobile apps, considered among the most engaging socio-technical systems. We
used content analysis and expectancy violation theory (EVT) and expectation
confirmation theory (ECT) to explore the evidence and nature of sanctions in
app reviews for a specific domain of apps. Our outcomes show that users respond
to expectation violation with sanctions when their app does not work as
anticipated, developers seem to target specific market niches when providing
services in an app domain, and users within an app domain respond with similar
sanctions. We contribute to the advancement of expectation violation theories,
and we provide practical insights for the mobile app community.
【90】 Learning to be adversarially robust and differentially private
标题:学会变得相反的健壮和与众不同的私密
作者:Jamie Hayes,Borja Balle,M. Pawan Kumar
备注:Preliminary work appeared at PPML 2021
摘要:We study the difficulties in learning that arise from robust and
differentially private optimization. We first study convergence of gradient
descent based adversarial training with differential privacy, taking a simple
binary classification task on linearly separable data as an illustrative
example. We compare the gap between adversarial and nominal risk in both
private and non-private settings, showing that the data dimensionality
dependent term introduced by private optimization compounds the difficulties of
learning a robust model. After this, we discuss what parts of adversarial
training and differential privacy hurt optimization, identifying that the size
of adversarial perturbation and clipping norm in differential privacy both
increase the curvature of the loss landscape, implying poorer generalization
performance.
【91】 ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
标题:ITSA:立体匹配网络中自动回避捷径和区域泛化的信息论方法
作者:WeiQin Chuah,Ruwan Tennakoon,Reza Hoseinnezhad,Alireza Bab-Hadiashar,David Suter
备注:11 pages, 4 figures
摘要:State-of-the-art stereo matching networks trained only on synthetic data
often fail to generalize to more challenging real data domains. In this paper,
we examine, through the lens of shortcut learning, an important factor that hinders
the networks from generalizing across domains. We
demonstrate that the learning of feature representations in stereo matching
networks is heavily influenced by synthetic data artefacts (shortcut
attributes). To mitigate this issue, we propose an Information-Theoretic
Shortcut Avoidance (ITSA) approach to automatically restrict shortcut-related
information from being encoded into the feature representations. As a result,
our proposed method learns robust and shortcut-invariant features by minimizing
the sensitivity of latent features to input variations. To avoid the
prohibitive computational cost of direct input sensitivity optimization, we
propose an effective yet feasible algorithm to achieve robustness. We show that
using this method, state-of-the-art stereo matching networks that are trained
purely on synthetic data can effectively generalize to challenging and
previously unseen real data scenarios. Importantly, the proposed method
enhances the robustness of the synthetic trained networks to the point that
they outperform their fine-tuned counterparts (on real data) for challenging
out-of-domain stereo datasets.
【92】 A unified software/hardware scalable architecture for brain-inspired computing based on self-organizing neural models
标题:基于自组织神经模型的脑启发计算软硬件统一可扩展体系结构
作者:Artem R. Muliukov,Laurent Rodriguez,Benoit Miramond,Lyes Khacef,Joachim Schmidt,Quentin Berthet,Andres Upegui
摘要:The field of artificial intelligence has significantly advanced over the past
decades, inspired by discoveries from the fields of biology and neuroscience.
The idea of this work is inspired by the process of self-organization of
cortical areas in the human brain from both afferent and lateral/internal
connections. In this work, we develop an original brain-inspired neural model
associating Self-Organizing Maps (SOM) and Hebbian learning in the Reentrant
SOM (ReSOM) model. The framework is applied to multimodal classification
problems. Compared to existing methods based on unsupervised learning with
post-labeling, the model enhances the state-of-the-art results. This work also
demonstrates the distributed and scalable nature of the model through both
simulation results and hardware execution on a dedicated FPGA-based platform
named SCALP (Self-configurable 3D Cellular Adaptive Platform). SCALP boards can
be interconnected in a modular way to support the structure of the neural
model. Such a unified software and hardware approach enables the processing to
be scaled and allows information from several modalities to be merged
dynamically. The deployment on hardware boards provides performance results of
parallel execution on several devices, with the communication between each
board through dedicated serial links. The proposed unified architecture,
composed of the ReSOM model and the SCALP hardware platform, demonstrates a
significant increase in accuracy thanks to multimodal association, and a good
trade-off between latency and power consumption compared to a centralized GPU
implementation.
【93】 CitySurfaces: City-Scale Semantic Segmentation of Sidewalk Materials
标题:CitySurfaces:人行道材质的城市尺度语义分割
作者:Maryam Hosseini,Fabio Miranda,Jianzhe Lin,Claudio Silva
备注:Sustainable Cities and Society journal (accepted); Model: this https URL
摘要:While designing sustainable and resilient urban built environments is
increasingly promoted around the world, significant data gaps have made
research on pressing sustainability issues challenging to carry out. Pavements
are known to have strong economic and environmental impacts; however, most
cities lack a spatial catalog of their surfaces due to the cost-prohibitive and
time-consuming nature of data collection. Recent advancements in computer
vision, together with the availability of street-level images, provide new
opportunities for cities to extract large-scale built environment data with
lower implementation costs and higher accuracy. In this paper, we propose
CitySurfaces, an active learning-based framework that leverages computer vision
techniques for classifying sidewalk materials using widely available
street-level images. We trained the framework on images from New York City and
Boston and the evaluation results show a 90.5% mIoU score. Furthermore, we
evaluated the framework using images from six different cities, demonstrating
that it can be applied to regions with distinct urban fabrics, even outside the
domain of the training data. CitySurfaces can provide researchers and city
agencies with a low-cost, accurate, and extensible method to collect sidewalk
material data which plays a critical role in addressing major sustainability
issues, including climate change and surface water management.
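The mIoU figure quoted above is a mean of per-class intersection-over-union scores. As a minimal sketch of how such a score is commonly computed from a confusion matrix (the function name `mean_iou` and the toy matrix are illustrative assumptions, not taken from the CitySurfaces code):

```python
def mean_iou(conf):
    """Mean intersection-over-union from a square confusion matrix.

    conf[r][c] counts pixels whose true class is r and predicted class is c.
    Classes that never occur (no true or predicted pixels) are skipped.
    """
    k = len(conf)
    ious = []
    for c in range(k):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(k)) - tp  # predicted c, true class differs
        fn = sum(conf[c]) - tp                       # true c, predicted class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Hypothetical two-class example (e.g. "concrete" vs "brick").
    print(mean_iou([[8, 2], [1, 9]]))
```

Averaging per class rather than per pixel keeps rare surface materials from being drowned out by dominant ones, which is one reason mIoU is a common choice for segmentation benchmarks.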
【94】 Applying Word Embeddings to Measure Valence in Information Operations Targeting Journalists in Brazil
标题:在巴西以记者为目标的信息操作中应用词嵌入来衡量价位
作者:David A. Broniatowski
摘要:Among the goals of information operations is to change the overall
information environment vis-à-vis specific actors. For example, "trolling
campaigns" seek to undermine the credibility of specific public figures,
leading others to distrust them and intimidating these figures into silence. To
accomplish these aims, information operations frequently make use of "trolls"
-- malicious online actors who target verbal abuse at these figures. In Brazil,
in particular, allies of Brazil's current president have been accused of
operating a "hate cabinet" -- a trolling operation that targets journalists who
have alleged corruption by this politician and other members of his regime.
Leading approaches to detecting harmful speech, such as Google's Perspective
API, seek to identify specific messages with harmful content. While this
approach is helpful in identifying content to downrank, flag, or remove, it is
known to be brittle, and may miss attempts to introduce more subtle biases into
the discourse. Here, we aim to develop a measure that might be used to assess
how targeted information operations seek to change the overall valence, or
appraisal, of specific actors. Preliminary results suggest known campaigns
target female journalists more so than male journalists, and that these
campaigns may leave detectable traces in overall Twitter discourse.
【95】 Data-Efficient Learning of High-Quality Controls for Kinodynamic Planning used in Vehicular Navigation
标题:用于车辆导航的高质量运动规划控制的数据高效学习
作者:Seth Karten,Aravind Sivaramakrishnan,Edgar Granados,Troy McMahon,Kostas E. Bekris
摘要:This paper aims to improve the path quality and computational efficiency of
kinodynamic planners used for vehicular systems. It proposes a learning
framework for identifying promising controls during the expansion process of
sampling-based motion planners for systems with dynamics. Offline, the learning
process is trained to return the highest-quality control that reaches a local
goal state (i.e., a waypoint) in the absence of obstacles from an input
difference vector between its current state and a local goal state. The data
generation scheme provides bounds on the target dispersion and uses state space
pruning to ensure high-quality controls. By focusing on the system's dynamics,
this process is data efficient and takes place once for a dynamical system, so
that it can be used for different environments with modular expansion
functions. This work integrates the proposed learning process with a) an
exploratory expansion function that generates waypoints with biased coverage
over the reachable space, and b) an exploitative expansion function
for mobile robots, which generates waypoints using medial axis information.
This paper evaluates the learning process and the corresponding planners for
first- and second-order differential drive systems. The results show that the
proposed integration of learning and planning can produce better quality paths
than kinodynamic planning with random controls, with fewer iterations and less
computation time.
【96】 A Taxonomy of Social VR Design
标题:社会虚拟现实设计的一种分类学
作者:Douglas Zytko,Ryan Handley,Bert Guerra,Rukkmini Goli
摘要:Social VR has experienced tremendous growth in the commercial space recently
as an emerging technology for rich interactions themed around leisure, work,
and relationship building. As a result, the state of social VR application
design has become rapidly obfuscated, which complicates identification of
design trends and uncommon features that could inform future design, and
hinders inclusion of new voices in this design space. To help address this
problem, we present a taxonomy of social VR application design choices as
informed by 44 commercial and prototypical applications. Our taxonomy was
informed by multiple discovery strategies including literature review, search
of VR-themed subreddits, and autobiographical landscape research. The taxonomy
elucidates various features across three design areas: the self, interaction,
and the environment.
【97】 Efficient Algebraic Two-Level Schwarz Preconditioner For Sparse Matrices
作者:Hussam Al Daas,Pierre Jolivet,Tyrone Rees
摘要:Domain decomposition methods are among the most efficient for solving sparse
linear systems of equations. Their effectiveness relies on a judiciously chosen
coarse space. Originally introduced and theoretically proved to be efficient
for self-adjoint operators, spectral coarse spaces have been proposed in the
past few years for indefinite and non-self-adjoint operators. This paper
presents a new spectral coarse space that can be constructed in a
fully-algebraic way unlike most existing spectral coarse spaces. We present
a theoretical convergence result for Hermitian positive definite diagonally
dominant matrices. Numerical experiments and comparisons against
state-of-the-art preconditioners in the multigrid community show that the
resulting two-level Schwarz preconditioner is efficient especially for
non-self-adjoint operators. Furthermore, in this case, our proposed
preconditioner outperforms state-of-the-art preconditioners.
【98】 Fixation Maximization in the Positional Moran Process
标题:位置性Moran过程中的注视最大化
作者:Joachim Brendborg,Panagiotis Karras,Andreas Pavlogiannis,Asger Ullersted Rasmussen,Josef Tkadlec
备注:11 pages, 6 figures, to appear at AAAI 2022
摘要:The Moran process is a classic stochastic process that models invasion
dynamics on graphs. A single "mutant" (e.g., a new opinion, strain, social
trait etc.) invades a population of residents spread over the nodes of a graph.
The mutant fitness advantage $\delta\geq 0$ determines how aggressively mutants
propagate to their neighbors. The quantity of interest is the fixation
probability, i.e., the probability that the initial mutant eventually takes
over the whole population. However, in realistic settings, the invading mutant
has an advantage only in certain locations. E.g., a bacterial mutation allowing
for lactose metabolism only confers an advantage on places where dairy products
are present. In this paper we introduce the positional Moran process, a natural
generalization in which the mutant fitness advantage is only realized on
specific nodes called active nodes. The associated optimization problem is
fixation maximization: given a budget $k$, choose a set of $k$ active nodes
that maximize the fixation probability of the invading mutant. We show that the
problem is NP-hard, while the optimization function is not submodular, thus
indicating strong computational hardness. Then we focus on two natural limits.
In the limit of $\delta\to\infty$ (strong selection), although the problem
remains NP-hard, the optimization function becomes submodular and thus admits a
constant-factor approximation using a simple greedy algorithm. In the limit of
$\delta\to 0$ (weak selection), we show that in $O(m^\omega)$ time we can
obtain a tight approximation, where $m$ is the number of edges and $\omega$ is
the matrix-multiplication exponent. Finally, we present an experimental
evaluation of the new algorithms together with some proposed heuristics.
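The process described above can be explored with a small Monte Carlo simulation. The sketch below is not the authors' code: it assumes the standard birth-death update (a node is picked to reproduce with probability proportional to its fitness, and its offspring replaces a uniformly random neighbor), a uniformly random initial mutant, and fitness $1+\delta$ only for mutants located on active nodes. The function names and the example graph are hypothetical.

```python
import random


def simulate_fixation(adj, active, delta, rng):
    """Run one trajectory; return True if the mutants take over every node."""
    n = len(adj)
    mutants = {rng.randrange(n)}  # single mutant at a uniformly random node
    while 0 < len(mutants) < n:
        # A mutant gets fitness 1 + delta only while sitting on an active node;
        # residents and mutants on inactive nodes have fitness 1.
        weights = [1.0 + delta if (v in mutants and v in active) else 1.0
                   for v in range(n)]
        parent = rng.choices(range(n), weights=weights, k=1)[0]
        child = rng.choice(adj[parent])
        if parent in mutants:
            mutants.add(child)      # offspring is a mutant
        else:
            mutants.discard(child)  # offspring is a resident
    return len(mutants) == n


def estimate_fixation_probability(adj, active, delta, trials=20000, seed=0):
    """Estimate the fixation probability as the fraction of fixating runs."""
    rng = random.Random(seed)
    wins = sum(simulate_fixation(adj, active, delta, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    # Complete graph on 4 nodes as adjacency lists.
    k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
    print(estimate_fixation_probability(k4, active=set(), delta=1.0))
    print(estimate_fixation_probability(k4, active={0, 1, 2, 3}, delta=1.0))
```

With no active nodes the process is neutral, so the estimate on a complete graph of $n$ nodes should hover around $1/n$. An estimator like this could serve as the objective inside a greedy loop that adds one active node at a time, although the noise in the estimates would then need care.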
【99】 Source Code Anti-Plagiarism: a C# Implementation using the Routing Approach
标题:源代码反剽窃:使用路由方法的C#实现
作者:Fabrizio d'Amore,Lorenzo Zarfati
摘要:Despite the approaches proposed so far, software plagiarism is still a
problem which has not been solved entirely yet. The approach introduced
throughout this paper is about a source code anti-plagiarism technique which
aims at rendering the source code incomprehensible to a possible plagiarist and
at the same time preventing source code modifications. The proposal is based on
the concept of Router and makes use of both symmetric encryption and
cryptographic hashing functions to provide such guarantees.
【100】 An Input-to-State Safety Approach to Anomaly-Resilient Parabolic PDEs: Application to Cyber-Physical Battery Modules
作者:Tanushree Roy,Ashley Knichel,Satadru Dey
摘要:Distributed Parameter Cyber-Physical Systems (DPCPSs), modelled by Partial
Differential Equations (PDEs), are increasingly vulnerable to anomalies such as
physical faults as well as cyber-attacks. This motivates the need for
strategies towards anomaly-resilient control of these systems. Although anomaly
detection and diagnostics in PDE systems have received considerable attention
in existing literature, fault-tolerant or anomaly-resilient control for PDEs
remains relatively under-explored. However, given the vulnerabilities of these
systems against anomalies, it is essential that the control systems possess
resilience against these disruptions. In this context, we explore a Practical
Input-to-Safety (pISSf) based control design approach for a class of DPCPSs
modelled by linear Parabolic PDEs. Specifically, we develop a design framework
for anomaly-resilient control for this class of system with both safety and
stability guarantees based on control Lyapunov functional and control barrier
functional. To illustrate our methodology, we apply our strategy to design a
thermal-anomaly resilient boundary coolant control system for a cyber-physical
battery module. Several simulation studies are done to show the efficacy of our
method under anomalies such as mechanical battery degradation and cyber-attack
mediated overdischarge.
【101】 Multi-modal data fusion of Voice and EMG data for Robotic Control
标题:机器人控制中语音和肌电数据的多模态数据融合
作者:Tauheed Khan Mohd,Jackson Carvalho,Ahmad Y Javaid
摘要:Wearable electronic equipment is constantly evolving and is increasing the
integration of humans with technology. Available in various forms, these
flexible and bendable devices sense and can measure the physiological and
muscular changes in the human body and may use those signals for machine
control. The MYO gesture band, one such device, captures Electromyography data
(EMG) using myoelectric signals and translates them to be used as input signals
through some predefined gestures. Use of this device in a multi-modal
environment will not only increase the possible types of work that can be
accomplished with the help of such device, but it will also help in improving
the accuracy of the tasks performed. This paper addresses the fusion of input
modalities such as speech and myoelectric signals captured through a microphone
and MYO band, respectively, to control a robotic arm. Experimental results
obtained as well as their accuracies for performance analysis are also
presented.
【102】 Detecting Anomalies using Overlapping Electrical Measurements in Smart Power Grids
标题:利用重叠电测量检测智能电网中的异常
作者:Sina Sontowski,Nigel Lawrence,Deepjyoti Deka,Maanak Gupta
摘要:As cyber-attacks against critical infrastructure become more frequent, it is
increasingly important to be able to rapidly identify and respond to these
threats. This work investigates two independent systems with overlapping
electrical measurements with the goal to more rapidly identify anomalies. The
independent systems include HIST, a SCADA historian, and ION, an automatic
meter reading system (AMR). While prior research has explored the benefits of
fusing measurements, the possibility of overlapping measurements from an
existing electrical system has not been investigated. To that end, we explore
the potential benefits of combining overlapping measurements both to improve
the speed/accuracy of anomaly detection and to provide additional validation of
the collected measurements. In this paper, we show that merging overlapping
measurements provides a more holistic picture of the observed systems. By
applying Dynamic Time Warping, more anomalies were found -- specifically, an
average of 349 times more anomalies, when considering anomalies from both
overlapping measurements. When merging the overlapping measurements, a percent
change of anomalies of up to 785% can be achieved compared to a non-merge of
the data as reflected by experimental results.
【103】 Consistent Style Transfer
标题:一致的风格传递
作者:Xuan Luo,Zhen Han,Lingkang Yang,Lingling Zhang
备注:10 pages, 11 figures
摘要:Recently, attentional arbitrary style transfer methods have been proposed to
achieve fine-grained results, which manipulate the point-wise similarity
between content and style features for stylization. However, the attention
mechanism based on feature points ignores the feature multi-manifold
distribution, where each feature manifold corresponds to a semantic region in
the image. Consequently, a uniform content semantic region is rendered by
highly different patterns from various style semantic regions, producing
inconsistent stylization results with visual artifacts. We propose the
progressive attentional manifold alignment (PAMA) to alleviate this problem,
which repeatedly applies attention operations and space-aware interpolations.
The attention operation rearranges style features dynamically according to the
spatial distribution of content features. This makes the content and style
manifolds correspond on the feature map. Then the space-aware interpolation
adaptively interpolates between the corresponding content and style manifolds
to increase their similarity. By gradually aligning the content manifolds to
style manifolds, the proposed PAMA achieves state-of-the-art performance while
avoiding the inconsistency of semantic regions. Codes are available at
https://github.com/computer-vision2022/PAMA.
【104】 Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT
标题:具有远程监控和置信度校准的大规模蛋白质翻译后修饰提取
作者:Aparna Elangovan,Yuan Li,Douglas E. V. Pires,Melissa J. Davis,Karin Verspoor
摘要:Protein-protein interactions (PPIs) are critical to normal cellular function
and are related to many disease pathways. However, only 4% of PPIs are
annotated with PTMs in biological knowledge databases such as IntAct, mainly
performed through manual curation, which is neither time- nor cost-effective. We
use the IntAct PPI database to create a distant supervised dataset annotated
with interacting protein pairs, their corresponding PTM type, and associated
abstracts from the PubMed database. We train an ensemble of BioBERT models,
dubbed PPI-BioBERT-x10, to improve confidence calibration. We extend the use of
ensemble average confidence approach with confidence variation to counteract
the effects of class imbalance to extract high confidence predictions. The
PPI-BioBERT-x10 model evaluated on the test set resulted in a modest F1-micro of
41.3 (P = 58.1, R = 32.1). However, by combining high confidence and low
variation to identify high quality predictions, tuning the predictions for
precision, we retained 19% of the test predictions with 100% precision. We
evaluated PPI-BioBERT-x10 on 18 million PubMed abstracts and extracted 1.6
million (546507 unique PTM-PPI triplets) PTM-PPI predictions, and filtered ~5700
(4584 unique) high confidence predictions. Of the 5700, human evaluation on a
small randomly sampled subset shows that the precision drops to 33.7% despite
confidence calibration and highlights the challenges of generalisability beyond
the test set even with confidence calibration. We circumvent the problem by
only including predictions associated with multiple papers, improving the
precision to 58.8%. In this work, we highlight the benefits and challenges of
deep learning-based text mining in practice, and the need for increased
emphasis on confidence calibration to facilitate human curation efforts.
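The idea of keeping only predictions on which the ensemble is both confident and in agreement can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function name `select_confident` and the thresholds `min_mean` and `max_std` are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def select_confident(ensemble_probs, min_mean=0.9, max_std=0.05):
    """Return indices of examples with high average confidence and low variation.

    ensemble_probs[i] holds one confidence score per ensemble member for the
    predicted class of example i. Both thresholds are hypothetical.
    """
    keep = []
    for idx, probs in enumerate(ensemble_probs):
        if mean(probs) >= min_mean and pstdev(probs) <= max_std:
            keep.append(idx)
    return keep


if __name__ == "__main__":
    scores = [
        [0.95, 0.97, 0.96],  # confident, members agree
        [0.99, 0.99, 0.75],  # confident on average, members disagree
        [0.50, 0.50, 0.50],  # members agree, but confidence is low
    ]
    print(select_confident(scores))
```

Tightening either threshold trades recall for precision, which matches the abstract's account of retaining a small share of test predictions at very high precision.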
【105】 PIEEG: Turn a Raspberry Pi into a Brain-Computer-Interface to measure biosignals
标题:PIEEG:把树莓PI变成脑机接口来测量生物信号
作者:Ildar Rakhmatulin,Sebastian Volkl
摘要:This paper presents an inexpensive, high-precision, but at the same time,
easy-to-maintain PIEEG board to convert a RaspberryPI to a Brain-computer
interface. This shield allows measuring and processing eight real-time EEG
(Electroencephalography) signals. We used the most popular programming
languages - C, C++ and Python to read the signals recorded by the device. The
process of reading EEG signals was demonstrated as completely and clearly as
possible. This device can be easily used by machine learning enthusiasts to
create projects for controlling robots and mechanical limbs using the power of
thought. We will post use cases on GitHub
(https://github.com/Ildaron/EEGwithRaspberryPI) for controlling a robotic
machine, unmanned aerial vehicle, and more just using the power of thought.
【106】 Predicting Trust Using Automated Assessment of Multivariate Interactional Synchrony
标题:基于多变量交互同步性自动评估的信任预测
作者:Adrien Meynard,Gayan Seneviratna,Elliot Doyle,Joyanne Becker,Hau-Tieng Wu,Jana Schaich Borg
摘要:Diverse disciplines are interested in how the coordination of interacting
agents' movements, emotions, and physiology over time impacts social behavior.
Here, we describe a new multivariate procedure for automating the investigation
of this kind of behaviorally-relevant "interactional synchrony", and introduce
a novel interactional synchrony measure based on features of dynamic time
warping (DTW) paths. We demonstrate that our DTW path-based measure of
interactional synchrony between facial action units of two people interacting
freely in a natural social interaction can be used to predict how much trust
they will display in a subsequent Trust Game. We also show that our approach
outperforms univariate head movement models, models that consider participants'
facial action units independently, and models that use previously proposed
synchrony or similarity measures. The insights of this work can be applied to
any research question that aims to quantify the temporal coordination of
multiple signals over time, but has immediate applications in psychology,
medicine, and robotics.
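For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic univariate algorithm, returning both the alignment cost and the warping path. It is the textbook dynamic-programming formulation with an absolute-difference local cost, not the multivariate procedure or the path-based synchrony measure proposed in the paper; the function name `dtw` is an assumption.

```python
def dtw(a, b):
    """Classic DTW between two non-empty numeric sequences.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    aligning a[i] with b[j], from (0, 0) to (len(a) - 1, len(b) - 1).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
    print(distance, path)
```

Features of the returned path, such as how far it strays from the diagonal, are the kind of quantity a path-based synchrony measure could be built on.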
【107】 Nonlocal Kernel Network (NKN): a Stable and Resolution-Independent Deep Neural Network
标题:非局部核网络(NKN):一种稳定的与分辨率无关的深度神经网络
作者:Huaiqian You,Yue Yu,Marta D'Elia,Tian Gao,Stewart Silling
摘要:Neural operators have recently become popular tools for designing solution
maps between function spaces in the form of neural networks. Differently from
classical scientific machine learning approaches that learn parameters of a
known partial differential equation (PDE) for a single instance of the input
parameters at a fixed resolution, neural operators approximate the solution map
of a family of PDEs. Despite their success, the uses of neural operators are so
far restricted to relatively shallow neural networks and confined to learning
hidden governing laws. In this work, we propose a novel nonlocal neural
operator, which we refer to as nonlocal kernel network (NKN), that is
resolution independent, characterized by deep neural networks, and capable of
handling a variety of tasks such as learning governing equations and
classifying images. Our NKN stems from the interpretation of the neural network
as a discrete nonlocal diffusion reaction equation that, in the limit of
infinite layers, is equivalent to a parabolic nonlocal equation, whose
stability is analyzed via nonlocal vector calculus. The resemblance with
integral forms of neural operators allows NKNs to capture long-range
dependencies in the feature space, while the continuous treatment of
node-to-node interactions makes NKNs resolution independent. The resemblance
with neural ODEs, reinterpreted in a nonlocal sense, and the stable network
dynamics between layers allow for generalization of NKN's optimal parameters
from shallow to deep networks. This fact enables the use of shallow-to-deep
initialization techniques. Our tests show that NKNs outperform baseline methods
in both learning governing equations and image classification tasks and
generalize well to different resolutions and depths.
【108】 On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems
标题:SQL代码嗅觉在数据密集型系统中的流行、影响和演变
作者:Biruk Asmare Muse,Mohammad Masudur Rahman,Csaba Nagy,Anthony Cleve,Foutse Khomh,Giuliano Antoniol
摘要:Code smells indicate software design problems that harm software quality.
Data-intensive systems that frequently access databases often suffer from SQL
code smells besides the traditional smells. While there have been extensive
studies on traditional code smells, recently, there has been a growing interest
in SQL code smells. In this paper, we conduct an empirical study to investigate
the prevalence and evolution of SQL code smells in open-source, data-intensive
systems. We collected 150 projects and examined both traditional and SQL code
smells in these projects. Our investigation delivers several important
findings. First, SQL code smells are indeed prevalent in data-intensive
software systems. Second, SQL code smells have a weak co-occurrence with
traditional code smells. Third, SQL code smells have a weaker association with
bugs than that of traditional code smells. Fourth, SQL code smells are more
likely to be introduced at the beginning of the project lifetime and likely to
be left in the code without a fix, compared to traditional code smells.
Overall, our results show that SQL code smells are indeed prevalent and
persistent in the studied data-intensive software systems. Developers should be
aware of these smells and consider detecting and refactoring SQL code smells
and traditional code smells separately, using dedicated tools.
【109】 Towards Industry 5.0: Intelligent Reflecting Surface (IRS) in Smart Manufacturing
标题:走向行业5.0:智能制造中的智能反射面(IRS)
作者:Md. Noor-A-Rahim,Fadhil Firyaguna,Jobish John,M. Omar Khyam,Dirk Pesch,Eddie Armstrong,Holger Claussen,H. Vincent Poor
摘要:Intelligent Reflecting Surface (IRS) is expected to become a key enabling
technology for 6G wireless communication networks as they can significantly
improve the wireless network's performance, creating a controllable radio
environment in preferred directions. The vision for Industry 5.0 is for close
cooperation between humans and machines, requiring ultra-reliability and low
latency communications (URLLC). IRS is expected to play a crucial role in
realizing wireless URLLC for Industry 5.0. In this paper, we first provide an
overview of IRS technology and then conceptualize the potential for IRS
implementation in a smart manufacturing environment to support the emergence of
Industry 5.0 with a series of applications. Finally, to stimulate future
research in this area, we discuss the strength, open challenges, maturity, and
enhancing areas of the IRS technology in modern smart manufacturing.
【110】 Explainable deep learning for insights in El Nino and river flows
标题:可解释的深度学习,以洞察厄尔尼诺现象和河流流动
作者:Yumin Liu,Kate Duffy,Jennifer G. Dy,Auroop R. Ganguly
摘要:The El Nino Southern Oscillation (ENSO) is a semi-periodic fluctuation in sea
surface temperature (SST) over the tropical central and eastern Pacific Ocean
that influences interannual variability in regional hydrology across the world
through long-range dependence or teleconnections. Recent research has
demonstrated the value of Deep Learning (DL) methods for improving ENSO
prediction as well as Complex Networks (CN) for understanding teleconnections.
However, gaps in predictive understanding of ENSO-driven river flows include
the black box nature of DL, the use of simple ENSO indices to describe a
complex phenomenon and translating DL-based ENSO predictions to river flow
predictions. Here we show that eXplainable DL (XDL) methods, based on saliency
maps, can extract interpretable predictive information contained in global SST
and discover novel SST information regions and dependence structures relevant
for river flows which, in tandem with climate network constructions, enable
improved predictive understanding. Our results reveal additional information
content in global SST beyond ENSO indices, develop new understanding of how
SSTs influence river flows, and generate improved river flow predictions with
uncertainties. Observations, reanalysis data, and earth system model
simulations are used to demonstrate the value of the XDL-CN based methods for
future interannual and decadal scale climate projections.
【111】 The E-Intelligence System
标题:电子情报系统
作者:Vibhor Gautam,Vikalp Shishodia
摘要:Electronic Intelligence (ELINT), often known as E-Intelligence, is
intelligence obtained through electronic sensors. Other than personal
communications, ELINT intelligence is usually obtained. The goal is usually to
determine a target's capabilities, such as radar placement. Active or passive
sensors can be employed to collect data. A provided signal is analyzed and
contrasted to collected data for recognized signal types. The information may
be stored if the signal type is detected; it can be classed as new if no match
is found. ELINT collects and categorizes data. In a military setting (and
others that have adopted the usage, such as a business), intelligence helps an
organization make decisions that can provide them a strategic advantage over
the competition. The term is frequently shortened to "intel". The two main
subfields of signals intelligence (SIGINT) are ELINT and Communications
Intelligence (COMINT). The US Department of Defense specifies the
terminologies, and intelligence communities use the categories of data reviewed
worldwide.
【112】 An Incremental Learning Approach to Automatically Recognize Pulmonary Diseases from the Multi-vendor Chest Radiographs
标题:一种增量学习方法自动识别多厂商胸片中的肺部疾病
作者:Mehreen Sirshar,Taimur Hassan,Muhammad Usman Akram,Shoab Ahmed Khan
摘要:Pulmonary diseases can cause severe respiratory problems, leading to sudden
death if not treated timely. Many researchers have utilized deep learning
systems to diagnose pulmonary disorders using chest X-rays (CXRs). However,
such systems require exhaustive training efforts on large-scale data to
effectively diagnose chest abnormalities. Furthermore, procuring such
large-scale data is often infeasible and impractical, especially for rare
diseases. With the recent advances in incremental learning, researchers have
periodically tuned deep neural networks to learn different classification tasks
with few training examples. Although such systems can resist catastrophic
forgetting, they treat the knowledge representations independently of each
other, and this limits their classification performance. Also, to the best of
our knowledge, there is no incremental learning-driven image diagnostic
framework that is specifically designed to screen pulmonary disorders from the
CXRs. To address this, we present a novel framework that can learn to screen
different chest abnormalities incrementally. In addition to this, the proposed
framework is penalized through an incremental learning loss function that
infers Bayesian theory to recognize structural and semantic inter-dependencies
between incrementally learned knowledge representations to diagnose the
pulmonary diseases effectively, regardless of the scanner specifications. We
tested the proposed framework on five public CXR datasets containing different
chest abnormalities, where it outperformed various state-of-the-art systems
across various metrics.
【113】 AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models
标题:增强的PCA:一个Python包,包含监督和对抗的线性因素模型
作者:William E. Carson IV,Austin Talbot,David Carlson
备注:NeurIPS 2021 (Learning Meaningful Representations of Life Workshop)
摘要:Deep autoencoders are often extended with a supervised or adversarial loss to
learn latent representations with desirable properties, such as greater
predictivity of labels and outcomes or fairness with respect to a sensitive
variable. Despite the ubiquity of supervised and adversarial deep latent factor
models, these methods should demonstrate improvement over simpler linear
approaches to be preferred in practice. This necessitates a reproducible linear
analog that still adheres to an augmenting supervised or adversarial objective.
We address this methodological gap by presenting methods that augment the
principal component analysis (PCA) objective with either a supervised or an
adversarial objective and provide analytic and reproducible solutions. We
implement these methods in an open-source Python package, AugmentedPCA, that
can produce excellent real-world baselines. We demonstrate the utility of these
factor models on an open-source, RNA-seq cancer gene expression dataset,
showing that augmenting with a supervised objective results in improved
downstream classification performance, produces principal components with
greater class fidelity, and facilitates identification of genes aligned with
the principal axes of data variance with implications to development of
specific types of cancer.
【114】 Machine-learning-based arc selection for constrained shortest path problems in column generation
标题:基于机器学习的列生成约束最短路径问题的圆弧选择
作者:Mouad Morabit,Guy Desaulniers,Andrea Lodi
摘要:Column generation is an iterative method used to solve a variety of
optimization problems. It decomposes the problem into two parts: a master
problem, and one or more pricing problems (PP). The total computing time taken
by the method is divided between these two parts. In routing or scheduling
applications, the problems are mostly defined on a network, and the PP is
usually an NP-hard shortest path problem with resource constraints. In this
work, we propose a new heuristic pricing algorithm based on machine learning.
By taking advantage of the data collected during previous executions, the
objective is to reduce the size of the network and accelerate the PP, keeping
only the arcs that have a high chance to be part of the linear relaxation
solution. The method has been applied to two specific problems: the vehicle and
crew scheduling problem in public transit and the vehicle routing problem with
time windows. Reductions in computational time of up to 40% can be obtained.
【115】 TOWER-Complete Problems in Contraction-Free Substructural Logics
标题:无收缩子结构逻辑中的塔式完备性问题
作者:Hiromi Tanaka
备注:Draft
摘要:We investigate the computational complexity of a family of substructural
logics with exchange and weakening but without contraction. With the aid of the
techniques provided by Lazić and Schmitz (2015), we show that the
deducibility problem for full Lambek calculus with exchange and weakening
($\mathbf{FL}_{\mathbf{ew}}$) is TOWER-complete, where TOWER is one of the
non-elementary complexity classes introduced by Schmitz (2016). The same
complexity result holds even for deducibility in BCK-logic, i.e., the
implicational fragment of $\mathbf{FL}_{\mathbf{ew}}$. We furthermore show the
TOWER-completeness of the provability problem for elementary affine logic,
which was proved to be decidable by Dal Lago and Martini (2004).
【116】 Deep Domain Adversarial Adaptation for Photon-efficient Imaging Based on Spatiotemporal Inception Network
标题:基于时空初始网络的光子有效成像的深域对抗性自适应
作者:Yiwei Chen,Gongxin Yao,Yong Liu,Yu Pan
摘要:In single-photon LiDAR, photon-efficient imaging captures the 3D structure of
a scene using only a few detected signal photons per pixel. The existing deep
learning models for this task are trained on simulated datasets, which poses
the domain shift challenge when applied to realistic scenarios. In this paper,
we propose a spatiotemporal inception network (STIN) for photon-efficient
imaging, which is able to precisely predict the depth from a sparse and
high-noise photon counting histogram by fully exploiting spatial and temporal
information. Then the domain adversarial adaptation frameworks, including
domain-adversarial neural network and adversarial discriminative domain
adaptation, are effectively applied to STIN to alleviate the domain shift
problem for realistic applications. Comprehensive experiments on the simulated
data generated from the NYU v2 and the Middlebury datasets demonstrate that
STIN outperforms the state-of-the-art models at low signal-to-background ratios
from 2:10 to 2:100. Moreover, experimental results on the real-world dataset
captured by the single-photon imaging prototype show that the STIN with domain
adversarial training achieves better generalization performance compared with
state-of-the-art methods as well as the baseline STIN trained on simulated data.
【117】 Negative Evidence Matters in Interpretable Histology Image Classification
标题:负证据在可解释组织学图像分类中的作用
作者:Soufiane Belharbi,Marco Pedersoli,Ismail Ben Ayed,Luke McCaffrey,Eric Granger
备注:10 figures, under review
摘要:Using only global annotations such as the image class labels,
weakly-supervised learning methods allow CNN classifiers to jointly classify an
image, and yield the regions of interest associated with the predicted class.
However, without any guidance at the pixel level, such methods may yield
inaccurate regions. This problem is known to be more challenging with histology
images than with natural ones, since objects are less salient, structures have
more variations, and foreground and background regions have stronger
similarities. Therefore, methods in computer vision literature for visual
interpretation of CNNs may not directly apply. In this work, we propose a
simple yet efficient method based on a composite loss function that leverages
information from the fully negative samples. Our new loss function contains two
complementary terms: the first exploits positive evidence collected from the
CNN classifier, while the second leverages the fully negative samples from the
training dataset. In particular, we equip a pre-trained classifier with a
decoder that allows refining the regions of interest. The same classifier is
exploited to collect both the positive and negative evidence at the pixel level
to train the decoder. This makes it possible to take advantage of the fully negative
samples that occur naturally in the data, without any additional supervision
signals and using only the image class as supervision. Compared to several
recent related methods, over the public benchmark GlaS for colon cancer and a
Camelyon16 patch-based benchmark for breast cancer using three different
backbones, we show the substantial improvements introduced by our method. Our
results show the benefits of using both negative and positive evidence, i.e.,
the one obtained from a classifier and the one naturally available in datasets.
We provide an ablation study of both terms. Our code is publicly available.
【118】 Applications of Signature Methods to Market Anomaly Detection
标题:签名方法在市场异常检测中的应用
作者:Erdinc Akyildirim,Matteo Gambara,Josef Teichmann,Syang Zhou
摘要:Anomaly detection is the process of identifying abnormal instances or events
in data sets which deviate from the norm significantly. In this study, we
propose a signature-based machine learning algorithm to detect rare or
unexpected items in a given data set of time series type. We present
applications of signature or randomized signature as feature extractors for
anomaly detection algorithms; additionally we provide an easy, representation
theoretic justification for the construction of randomized signatures. Our
first application is based on synthetic data and aims at distinguishing between
real and fake trajectories of stock prices, which are indistinguishable by
visual inspection. We also show a real life application by using transaction
data from the cryptocurrency market. In this case, we are able to identify pump
and dump attempts organized on social networks with F1 scores up to 88% by
means of our unsupervised learning algorithm, thus achieving results that are
close to the state-of-the-art in the field based on supervised learning.
【119】 Optimality in Noisy Importance Sampling
标题:噪声重要抽样中的最优性
作者:Fernando Llorente,Luca Martino,Jesse Read,David Delgado-Gómez
摘要:In this work, we analyze the noisy importance sampling (IS), i.e., IS working
with noisy evaluations of the target density. We present the general framework
and derive optimal proposal densities for noisy IS estimators. The optimal
proposals incorporate the information of the variance of the noisy
realizations, proposing points in regions where the noise power is higher. We
also compare the use of the optimal proposals with previous optimality
approaches considered in a noisy IS framework.
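
The basic noisy IS estimator described above can be sketched as follows. This is an illustrative sketch only, not the authors' code: the Gaussian target, the multiplicative uniform noise model, the wide Gaussian proposal, and all function names are assumptions chosen for the example, and the proposal used here is not the optimal one derived in the paper.

```python
import math
import random


def target_unnormalized(x):
    # Assumed toy target: unnormalized standard Gaussian, true Z = sqrt(2*pi).
    return math.exp(-0.5 * x * x)


def noisy_target(x, rng):
    # Assumed noise model: unbiased multiplicative noise on each evaluation.
    return target_unnormalized(x) * (1.0 + rng.uniform(-0.5, 0.5))


def proposal_pdf(x, scale):
    # Density of the zero-mean Gaussian proposal with standard deviation `scale`.
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def noisy_is_normalizer(n, scale=2.0, seed=0):
    # Importance sampling estimate of Z using only noisy target evaluations.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, scale)
        total += noisy_target(x, rng) / proposal_pdf(x, scale)
    return total / n


if __name__ == "__main__":
    print(noisy_is_normalizer(20000), math.sqrt(2.0 * math.pi))
```

Because the assumed noise is zero-mean, the estimator stays unbiased; the noise only inflates its variance, which is the quantity a proposal density can be tuned to reduce.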
【120】 Effect of Prior-based Losses on Segmentation Performance: A Benchmark
标题:基于先前损失对分割性能的影响:一个基准
作者:Rosana El Jurdi,Caroline Petitjean,Veronika Cheplygina,Paul Honeine,Fahed Abdallah
备注:To be submitted to SPIE: Journal of Medical Imaging
摘要:Today, deep convolutional neural networks (CNNs) have demonstrated
state-of-the-art performance for medical image segmentation, on various imaging
modalities and tasks. Despite early success, segmentation networks may still
generate anatomically aberrant segmentations, with holes or inaccuracies near
the object boundaries. To enforce anatomical plausibility, recent research
studies have focused on incorporating prior knowledge such as object shape or
boundary, as constraints in the loss function. The integrated prior can be
low-level, referring to reformulated representations extracted from the
ground-truth segmentations, or high-level, representing external medical
information such as the organ's shape or size. Over the past few years,
prior-based losses have attracted rising interest in the research field since they
allow integration of expert knowledge while still being architecture-agnostic.
However, given the diversity of prior-based losses on different medical imaging
challenges and tasks, it has become hard to identify what loss works best for
which dataset. In this paper, we establish a benchmark of recent prior-based
losses for medical image segmentation. The main objective is to provide
intuition about which losses to choose given a particular task or dataset. To
this end, four low-level and high-level prior-based losses are selected. The
considered losses are validated on 8 different datasets from a variety of
medical image segmentation challenges including the Decathlon, the ISLES and
the WMH challenge. Results show that whereas low-level prior-based losses can
guarantee an increase in performance over the Dice loss baseline regardless of
the dataset characteristics, high-level prior-based losses can increase
anatomical plausibility as per data characteristics.
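
For reference, the Dice loss baseline mentioned above is commonly written in a "soft" form on predicted probabilities. The sketch below is a generic formulation under assumptions (flat lists of per-pixel values, a small smoothing constant `eps`, the function name); it is not taken from the benchmark's code.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss between predicted probabilities and a binary mask.

    pred and target are flat sequences of equal length (one value per pixel).
    eps is an assumed smoothing constant that avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


if __name__ == "__main__":
    # One of the two predicted foreground pixels overlaps the single true pixel.
    print(soft_dice_loss([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]))
```

A prior-based loss would typically be added to such a baseline term as an extra penalty; how the terms are weighted is a design choice not specified in the abstract.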
【121】 Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding
标题:基于自动加权分层表示的三维视频编码视图合成失真估计
作者:Jian Jin,Xingxing Zhang,Lili Meng,Weisi Lin,Jie Liang,Huaxiang Zhang,Yao Zhao
摘要:Recently, various view synthesis distortion estimation models have been
studied to better serve 3-D video coding. However, they can hardly model
the relationship quantitatively among different levels of depth changes,
texture degeneration, and the view synthesis distortion (VSD), which is crucial
for rate-distortion optimization and rate allocation. In this paper, an
auto-weighted layer representation based view synthesis distortion estimation
model is developed. Firstly, the sub-VSD (S-VSD) is defined according to the
level of depth changes and their associated texture degeneration. After that, a
set of theoretical derivations demonstrate that the VSD can be approximately
decomposed into the S-VSDs multiplied by their associated weights. To obtain
the S-VSDs, a layer-based representation of S-VSD is developed, where all the
pixels with the same level of depth changes are represented with a layer to
enable efficient S-VSD calculation at the layer level. Meanwhile, a nonlinear
mapping function is learnt to accurately represent the relationship between the
VSD and S-VSDs, automatically providing weights for S-VSDs during the VSD
estimation. To learn such a function, a dataset of VSDs and their associated S-VSDs
is built. Experimental results show that the VSD can be accurately estimated
with the weights learnt by the nonlinear mapping function once its associated
S-VSDs are available. The proposed method outperforms the relevant
state-of-the-art methods in both accuracy and efficiency. The dataset and
source code of the proposed method will be available at
https://github.com/jianjin008/.
【122】 Amplitude SAR Imagery Splicing Localization
标题:幅度SAR图像拼接定位
作者:Edoardo Daniele Cannas,Nicolò Bonettini,Sara Mandelli,Paolo Bestagini,Stefano Tubaro
摘要:Synthetic Aperture Radar (SAR) images are a valuable asset for a wide variety
of tasks. In the last few years, many websites have been offering them for free
in the form of easy to manage products, favoring their widespread diffusion and
research work in the SAR field. The drawback of these opportunities is that
such images might be exposed to forgeries and manipulations by malicious users,
raising new concerns about their integrity and trustworthiness. Up to now, the
multimedia forensics literature has proposed various techniques to localize
manipulations in natural photographs, but the integrity assessment of SAR
images was never investigated. This task poses new challenges, since SAR images
are generated with a processing chain completely different from that of natural
photographs. This implies that many forensics methods developed for natural
images are not guaranteed to succeed. In this paper, we investigate the problem
of amplitude SAR imagery splicing localization. Our goal is to localize regions
of an amplitude SAR image that have been copied and pasted from another image,
possibly undergoing some kind of editing in the process. To do so, we leverage
a Convolutional Neural Network (CNN) to extract a fingerprint highlighting
inconsistencies in the processing traces of the analyzed input. Then, we
examine this fingerprint to produce a binary tampering mask indicating the
pixel region under splicing attack. Results show that our proposed method,
tailored to the nature of SAR signals, provides better performance than
state-of-the-art forensic tools developed for natural images.
【123】 Model-Free Nonlinear Feedback Optimization
作者:Zhiyu He,Saverio Bolognani,Jianping He,Florian Dörfler,Xinping Guan
摘要:Feedback optimization is a control paradigm that enables physical systems to
autonomously reach efficient operating points. Its central idea is to
interconnect optimization iterations in closed-loop with the physical plant.
Since iterative gradient-based methods are extensively used to achieve
optimality, feedback optimization controllers typically require the knowledge
of the steady-state sensitivity of the plant, which may not be easily
accessible in some applications. In contrast, in this paper we develop a
model-free feedback controller for efficient steady-state operation of general
dynamical systems. The proposed design consists in updating control inputs via
gradient estimates constructed from evaluations of the nonconvex objective at
the current input and at the measured output. We study the dynamic
interconnection of the proposed iterative controller with a stable nonlinear
discrete-time plant. For this setup, we characterize the optimality and the
stability of the closed-loop behavior as functions of the problem dimension,
the number of iterations, and the rate of convergence of the physical plant. To
handle general constraints that affect multiple inputs, we enhance the
controller with Frank-Wolfe type updates.
【124】 Investigation of the Relationship Between Localization Accuracy and Sensor Array
标题:定位精度与传感器阵列关系的研究
作者:Y Li
摘要:Magnetic localization, which has been widely studied, is mainly
based on accurate mapping of the magnetic field generated by magnetic
sources. Many factors affect localization accuracy in experiments.
This paper therefore studies the relationship between localization
accuracy and the sensor array through several experiments. The system uses a small
magnet as the magnetic source, and the mathematical model of the magnetic
positioning system is established based on the magnetic dipole model to
estimate the magnetic field. The Levenberg-Marquardt algorithm was used to
construct a magnetic positioning objective function for comparison experiments.
Experimental results show that when the sensors are evenly distributed around the
magnet, the positioning accuracy is higher than with other sensor array
layouts: the average localization error is 0.47 mm and the average orientation
error is 0.92 degrees.
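
The magnetic dipole model referred to above has a standard closed form, B(r) = (μ0/4π) (3 r̂ (m·r̂) − m) / |r|³. The sketch below evaluates it and builds the residual vector that a least-squares solver such as Levenberg-Marquardt would minimize. The parameter layout (position followed by moment), the function names, and the sensor values are assumptions for illustration, not the paper's setup.

```python
import math

MU0_OVER_4PI = 1e-7  # mu_0 / (4*pi) in T*m/A


def dipole_field(sensor_pos, magnet_pos, moment):
    # Field of a point dipole with moment `moment` (A*m^2) located at
    # `magnet_pos`, evaluated at `sensor_pos` (metres). Returns [Bx, By, Bz] in tesla.
    r = [s - p for s, p in zip(sensor_pos, magnet_pos)]
    dist = math.sqrt(sum(c * c for c in r))
    r_hat = [c / dist for c in r]
    m_dot_r = sum(m * c for m, c in zip(moment, r_hat))
    return [MU0_OVER_4PI * (3.0 * m_dot_r * rh - m) / dist ** 3
            for rh, m in zip(r_hat, moment)]


def residuals(params, sensor_positions, measured_fields):
    # params = [x, y, z, mx, my, mz] (assumed layout). Returns the stacked
    # differences between modelled and measured field components.
    magnet_pos, moment = params[:3], params[3:6]
    out = []
    for pos, meas in zip(sensor_positions, measured_fields):
        model = dipole_field(pos, magnet_pos, moment)
        out.extend(b_model - b_meas for b_model, b_meas in zip(model, meas))
    return out


if __name__ == "__main__":
    # On-axis field 10 cm above a 1 A*m^2 dipole pointing along z.
    print(dipole_field((0.0, 0.0, 0.1), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

The number and placement of sensors determines how many residual components are available and how well conditioned the fit is, which is the relationship the paper investigates experimentally.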
【125】 The Green's function of the Lax-Wendroff and Beam-Warming schemes
作者:Jean-François Coulombel
摘要:We prove a sharp uniform generalized Gaussian bound for the Green's function
of the Lax-Wendroff and Beam-Warming schemes. Our bound highlights the spatial
region that leads to the well-known (rather weak) instability of these schemes
in the maximum norm. We also recover uniform bounds in the maximum norm when
these schemes are applied to initial data of bounded variation.
【126】 Cross-Modality Deep Feature Learning for Brain Tumor Segmentation
标题:跨模态深度特征学习在脑肿瘤分割中的应用
作者:Dingwen Zhang,Guohai Huang,Qiang Zhang,Jungong Han,Junwei Han,Yizhou Yu
备注:published on Pattern Recognition 2021
摘要:Recent advances in machine learning and prevalence of digital medical images
have opened up an opportunity to address the challenging brain tumor
segmentation (BTS) task by using deep convolutional neural networks. However,
unlike RGB image data, which are very widespread, the medical image
data used in brain tumor segmentation are relatively scarce in terms of
data scale but contain richer information in terms of the modality
property. To this end, this paper proposes a novel cross-modality deep feature
learning framework to segment brain tumors from the multi-modality MRI data.
The core idea is to mine rich patterns across the multi-modality data to make
up for the insufficient data scale. The proposed cross-modality deep feature
learning framework consists of two learning processes: the cross-modality
feature transition (CMFT) process and the cross-modality feature fusion (CMFF)
process, which aim at learning rich feature representations by transiting
knowledge across different modality data and fusing knowledge from different
modality data, respectively. Comprehensive experiments are conducted on the
BraTS benchmarks, which show that the proposed cross-modality deep feature
learning framework can effectively improve the brain tumor segmentation
performance when compared with the baseline methods and state-of-the-art
methods.
【127】 Projective Embedding of Dynamical Systems: uniform mean field equations
标题:动力系统的射影嵌入:一致平均场方程
作者:Francesco Caravelli,Fabio L. Traversa,Michele Bonnin,Fabrizio Bonani
备注:45 pages; one column; 10 figures;
摘要:We study embeddings of continuous dynamical systems in larger dimensions via
projector operators. We call this technique PEDS, projective embedding of
dynamical systems, as the stable fixed points of the dynamics are recovered via
projection from the higher dimensional space. In this paper we provide a
general definition and prove that for a particular type of projector operator
of rank-1, the uniform mean field projector, the equations of motion become a
mean field approximation of the dynamical system. While in general the
embedding depends on a specified variable ordering, the same is not true for
the uniform mean field projector. In addition, we prove that the original
stable fixed points remain stable fixed points of the dynamics, saddle points
remain saddle, but unstable fixed points become saddles.
【128】 Multiresolution Fully Convolutional Networks to detect Clouds and Snow through Optical Satellite Images
标题:利用光学卫星图像探测云雪的多分辨率全卷积网络
作者:Debvrat Varshney,Claudio Persello,Prasun Kumar Gupta,Bhaskar Ramachandra Nikam
摘要:Clouds and snow have similar spectral features in the visible and
near-infrared (VNIR) range and are thus difficult to distinguish from each
other in high resolution VNIR images. We address this issue by introducing a
shortwave-infrared (SWIR) band where clouds are highly reflective, and snow is
absorptive. As SWIR is typically of a lower resolution compared to VNIR, this
study proposes a multiresolution fully convolutional neural network (FCN) that
can effectively detect clouds and snow in VNIR images. We fuse the
multiresolution bands within a deep FCN and perform semantic segmentation at
the higher, VNIR resolution. Such a fusion-based classifier, trained in an
end-to-end manner, achieved 94.31% overall accuracy and an F1 score of 97.67%
for clouds on Resourcesat-2 data captured over the state of Uttarakhand, India.
These scores were found to be 30% higher than a Random Forest classifier, and
10% higher than a standalone single-resolution FCN. Apart from being useful for
cloud detection purposes, the study also highlights the potential of
convolutional neural networks for multi-sensor fusion problems.
【129】 Bayesian Online Change Point Detection for Baseline Shifts
标题:基线偏移的贝叶斯在线变化点检测
作者:Ginga Yoshizawa
摘要:In time series data analysis, detecting change points on a real-time basis
(online) is of great interest in many areas, such as finance, environmental
monitoring, and medicine. One promising means to achieve this is the Bayesian
online change point detection (BOCPD) algorithm, which has been successfully
adopted in particular cases in which the time series of interest has a fixed
baseline. However, we have found that the algorithm struggles when the baseline
irreversibly shifts from its initial state. This is because with the original
BOCPD algorithm, the sensitivity with which a change point can be detected is
degraded if the data points are fluctuating at locations relatively far from
the original baseline. In this paper, we not only extend the original BOCPD
algorithm to be applicable to a time series whose baseline is constantly
shifting toward unknown values but also visualize why the proposed extension
works. To demonstrate the efficacy of the proposed algorithm compared to the
original one, we examine these algorithms on two real-world data sets and six
synthetic data sets.
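
For context, the original BOCPD recursion (not the baseline-shift extension proposed in this paper) can be sketched as below. The sketch assumes Gaussian observations with known variance, a Normal prior on the segment mean, and a constant hazard rate; the function names, default parameters, and synthetic series are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bocpd_map_run_lengths(data, hazard=0.01, mu0=0.0, var0=100.0, obs_var=1.0):
    # run_probs[r] is the posterior probability that the current run has length r.
    run_probs = [1.0]
    means = [mu0]        # posterior mean of the segment mean, per run length
    variances = [var0]   # posterior variance of the segment mean, per run length
    map_runs = []
    for x in data:
        # Predictive probability of x under each candidate run.
        preds = [normal_pdf(x, m, v + obs_var) for m, v in zip(means, variances)]
        # Either the run grows by one, or a change point resets it to zero.
        growth = [p * pr * (1.0 - hazard) for p, pr in zip(run_probs, preds)]
        change = sum(p * pr * hazard for p, pr in zip(run_probs, preds))
        new_probs = [change] + growth
        total = sum(new_probs)
        run_probs = [p / total for p in new_probs]
        # Conjugate update of the mean posterior for every grown run;
        # index 0 restarts from the prior.
        new_means, new_vars = [mu0], [var0]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            new_means.append(post_var * (m / v + x / obs_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars
        map_runs.append(max(range(len(run_probs)), key=run_probs.__getitem__))
    return map_runs


# Assumed synthetic series: the mean jumps from 0 to 5 after 50 samples.
_rng = random.Random(0)
series = [_rng.gauss(0.0, 1.0) for _ in range(50)] + [_rng.gauss(5.0, 1.0) for _ in range(50)]

if __name__ == "__main__":
    print(bocpd_map_run_lengths(series)[-1])
```

Every new run restarts from the same fixed prior (`mu0`, `var0`), so this version is tied to the original baseline. That is the setting in which, per the abstract, sensitivity degrades once the data fluctuate far from that baseline.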
【130】 RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images
标题:RestoreDet:低分辨率图像目标检测的退化等变表示
作者:Ziteng Cui,Yingying Zhu,Lin Gu,Guo-Jun Qi,Xiaoxiao Li,Peng Gao,Zenghui Zhang,Tatsuya Harada
备注:11 pages, 3 figures
摘要:Image restoration algorithms such as super resolution (SR) are indispensable
pre-processing modules for object detection in degraded images. However, most
of these algorithms assume the degradation is fixed and known a priori. When
the real degradation is unknown or differs from the assumption, both the
pre-processing module and the consequent high-level task such as object
detection would fail. Here, we propose a novel framework, RestoreDet, to detect
objects in degraded low resolution images. RestoreDet utilizes the downsampling
degradation as a kind of transformation for self-supervised signals to explore
the equivariant representation against various resolutions and other
degradation conditions. Specifically, we learn this intrinsic visual structure
by encoding and decoding the degradation transformation from a pair of original
and randomly degraded images. The framework could further take advantage of
advanced SR architectures with an arbitrary resolution restoring decoder to
reconstruct the original correspondence from the degraded input image. Both the
representation learning and object detection are optimized jointly in an
end-to-end training fashion. RestoreDet is a generic framework that could be
implemented on any mainstream object detection architectures. The extensive
experiment shows that our framework based on CenterNet has achieved superior
performance compared with existing methods when facing variant degradation
situations. Our code will be released soon.
【131】 Stochastic Saddle Point Problems with Decision-Dependent Distributions
标题:决策相关分布的随机鞍点问题
作者:Killian Wood,Emiliano Dall'Anese
摘要:This paper focuses on stochastic saddle point problems with
decision-dependent distributions in both the static and time-varying settings.
These are problems whose objective is the expected value of a stochastic payoff
function, where random variables are drawn from a distribution induced by a
distributional map. For general distributional maps, the problem of finding
saddle points is in general computationally burdensome, even if the
distribution is known. To enable a tractable solution approach, we introduce
the notion of equilibrium points -- which are saddle points for the stationary
stochastic minimax problem that they induce -- and provide conditions for their
existence and uniqueness. We demonstrate that the distance between the two
classes of solutions is bounded provided that the objective has a
strongly-convex-strongly-concave payoff and Lipschitz continuous distributional
map. We develop deterministic and stochastic primal-dual algorithms and
demonstrate their convergence to the equilibrium point. In particular, by
modeling errors emerging from a stochastic gradient estimator as sub-Weibull
random variables, we provide error bounds in expectation and in high
probability that hold for each iteration; moreover, we show convergence to a
neighborhood in expectation and almost surely. Finally, we investigate a
condition on the distributional map -- which we call opposing mixture dominance
-- that ensures the objective is strongly-convex-strongly-concave. Under this
assumption, we show that primal-dual algorithms converge to the saddle points
in a similar fashion.
【132】 Electric Vehicle Routing Problem with Spatio-temporal Varying Electricity Price and Incentive-aware Customers
作者:Canqi Yao,Shibo Chen,Mauro Salazar,Zaiyue Yang
备注:Submitted to IEEE TSG. arXiv admin note: substantial text overlap with arXiv:2110.06441
摘要:This paper investigates the optimization problem of a fleet of electric
vehicles (EVs) serving a set of time-specified customers, where the operator
needs to optimize the routing and charging problems jointly for each EV. In
particular, given the spatio-temporally varying electricity price, we
consider incentive-aware customers and propose that the operator offers
monetary incentives to exchange time flexibility of customers. In this manner,
a win-win situation is achievable since time flexibility enables the fleet
operator to obtain a routing and charging schedule with lower cost, whilst the
customers receive monetary compensation. Specifically, we first devise a
bi-level model whereby the fleet operator optimizes the routing and charging
schedule jointly with a monetary incentive to reimburse the delivery time
flexibility experienced by the customers. At the same time, the customers
choose the optimal time flexibility by minimizing their own cost. Second, we
tackle the complexity resulting from the bi-level and nonlinear problem with an
equivalent transformation method. Eventually, we reformulate the problem as a
single-level optimization problem, which is then solved by the proposed Benders
dual decomposition method, which has a faster convergence rate than the
generalized Benders decomposition method. To evaluate the effectiveness of our
framework and the proposed Benders dual decomposition algorithm, we carry out
extensive numerical experiments using VRP-REP data from Belgium.
【133】 Generalized quantum similarity learning
标题:广义量子相似学习
作者:Santosh Kumar Radha,Casey Jao
摘要:The similarity between objects is significant in a broad range of areas.
While similarity can be measured using off-the-shelf distance functions, they
may fail to capture the inherent meaning of similarity, which tends to depend
on the underlying data and task. Moreover, conventional distance functions
limit the space of similarity measures to be symmetric and do not directly
allow comparing objects from different spaces. We propose using quantum
networks (GQSim) for learning task-dependent (a)symmetric similarity between
data that need not have the same dimensionality. We analyze the properties of
such a similarity function analytically (for a simple case) and numerically (for
a complex case) and show that these similarity measures can extract salient
features of the data. We also demonstrate that the similarity measure derived
using this technique is $(\epsilon,\gamma,\tau)$-good, resulting in
theoretically guaranteed performance. Finally, we conclude by applying this
technique to three relevant applications: classification, graph completion,
and generative modeling.
【134】 A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction
标题:用于高螺距稀疏螺旋CT重建的三维双域深度网络
作者:Wei Wang,Xiang-Gen Xia,Chuanjiang He,Zemin Ren,Jian Lu
备注:13 pages, 5 figures
摘要:In this paper, we propose a new GPU implementation of the Katsevich algorithm
for helical CT reconstruction. Our implementation divides the sinograms and
reconstructs the CT images pitch by pitch. By utilizing the periodic properties
of the parameters of the Katsevich algorithm, our method only needs to
calculate these parameters once for all the pitches and so has lower GPU-memory
burdens and is very suitable for deep learning. By embedding our implementation
into the network, we propose an end-to-end deep network for the high pitch
helical CT reconstruction with sparse detectors. Since our network utilizes the
features extracted from both sinograms and CT images, it can simultaneously
reduce the streak artifacts caused by the sparsity of sinograms and preserve
fine details in the CT images. Experiments show that our network outperforms
the related methods both in subjective and objective evaluations.
【135】 A Theoretical Framework of Almost Hyperparameter-free Hyperparameter Selection Methods for Offline Policy Evaluation
标题:用于离线政策评估的几乎无超参数超参数选择方法的理论框架
作者:Kohei Miyaguchi
备注:AAAI22-AI4DO (workshop)
摘要:We are concerned with the problem of hyperparameter selection of offline
policy evaluation (OPE). OPE is a key component of offline reinforcement
learning, which is a core technology for data-driven decision optimization
without environment simulators. However, the current state-of-the-art OPE
methods are not hyperparameter-free, which undermines their utility in
real-life applications. We address this issue by introducing a new approximate
hyperparameter selection (AHS) framework for OPE, which defines a notion of
optimality (called selection criteria) in a quantitative and interpretable
manner without hyperparameters. We then derive four AHS methods each of which
has different characteristics such as convergence rate and time complexity.
Finally, we verify the effectiveness and limitations of these methods with a
preliminary experiment.
【136】 Local and Global Convergence of General Burer-Monteiro Tensor Optimizations
标题:广义布里-蒙泰罗张量优化问题的局部收敛性和全局收敛性
作者:Shuang Li,Qiuwei Li
摘要:Tensor optimization is crucial to massive machine learning and signal
processing tasks. In this paper, we consider tensor optimization with a convex
and well-conditioned objective function and reformulate it into a nonconvex
optimization using the Burer-Monteiro type parameterization. We analyze the
local convergence of applying vanilla gradient descent to the factored
formulation and establish a local regularity condition under mild assumptions.
We also provide a linear convergence analysis of the gradient descent algorithm
started in a neighborhood of the true tensor factors. Complementary to the
local analysis, this work also characterizes the global geometry of the best
rank-one tensor approximation problem and demonstrates that for orthogonally
decomposable tensors the problem has no spurious local minima and all saddle
points are strict except for the one at zero which is a third-order saddle
point.
【137】 Persistent Homology for Breast Tumor Classification using Mammogram Scans
标题:使用乳腺X线扫描实现乳腺肿瘤分类的持久同源性
作者:Aras Asaad,Dashti Ali,Taban Majeed,Rasber Rashid
备注:10 pages
摘要:An important tool in the field of topological data analysis is
persistent homology (PH), which is used to encode an abstract representation of the
homology of data at different resolutions in the form of a persistence diagram
(PD). In this work we build more than one PD representation of a single image
based on a landmark selection method, known as local binary patterns, that
encode different types of local textures from images. We employed different PD
vectorizations using persistence landscapes, persistence images, persistence
binning (Betti Curve) and statistics. We tested the effectiveness of the proposed
landmark-based PH on two publicly available breast abnormality detection
datasets using mammogram scans. The sensitivity obtained by landmark-based PH is
over 90% in both datasets for the detection of abnormal breast scans. Finally,
experimental results give new insights on using different types of PD
vectorizations which help in utilising PH in conjunction with machine learning
classifiers.
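
Of the vectorizations listed above, the Betti curve is the simplest to illustrate: at each threshold it counts the persistence intervals that are alive. The sketch below is a generic definition under assumptions (a diagram given as (birth, death) pairs, half-open intervals, hypothetical function name and toy values); it does not reproduce the paper's landmark-based pipeline.

```python
def betti_curve(diagram, thresholds):
    """Count, for each threshold t, the intervals with birth <= t < death.

    diagram is a list of (birth, death) pairs from one homology dimension;
    death may be float("inf") for features that never die.
    """
    return [sum(1 for birth, death in diagram if birth <= t < death)
            for t in thresholds]


if __name__ == "__main__":
    toy_diagram = [(0.0, 1.0), (0.2, 0.5), (0.4, float("inf"))]
    print(betti_curve(toy_diagram, [0.0, 0.3, 0.6, 1.0]))  # [1, 2, 2, 1]
```

The resulting fixed-length vector can be passed to any standard classifier, which is what makes such vectorizations convenient in combination with machine learning.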
【138】 Strategic Storage Investment in Electricity Markets
作者:Dongwei Zhao,Mehdi Jafari,Audun Botterud,Apurba Sakti
摘要:Arbitrage is one important revenue source for energy storage in electricity
markets. However, a large amount of storage in the market will impact the
energy price and reduce potential revenues. This can lead to strategic
behaviors of profit-seeking storage investors. To study the investors'
strategic storage investments, we formulate a non-cooperative game between
competing investors. Each investor decides the storage investment over a long
investment horizon, and operates the storage for arbitrage revenues in the
daily electricity market. Different investors can deploy storage with different
characteristics. Their decisions are coupled due to the market price that is
determined by all the investors' decisions. We use market data from California
ISO to characterize the storage impact on the market price, based on which we
establish a centralized optimization problem to compute the market equilibrium.
We show that an increasing number of investors will increase the market
competition, which reduces investors' profits but increases the total invested
storage capacity. Furthermore, we find that a slight increase in the storage
efficiency (e.g., increased charge and discharge efficiency) can significantly
improve an investor's profit share in the market.
【139】 GCWSNet: Generalized Consistent Weighted Sampling for Scalable and Accurate Training of Neural Networks
Authors: Ping Li, Weijie Zhao
Abstract: We develop the "generalized consistent weighted sampling" (GCWS) for hashing
the "powered-GMM" (pGMM) kernel (with a tuning parameter $p$). It turns out
that GCWS provides a numerically stable scheme for applying power
transformation on the original data, regardless of the magnitude of $p$ and the
data. The power transformation is often effective for boosting the performance,
in many cases considerably so. We feed the hashed data to neural networks on a
variety of public classification datasets and name our method "GCWSNet". Our
extensive experiments show that GCWSNet often improves the classification
accuracy. Furthermore, it is evident from the experiments that GCWSNet
converges substantially faster. In fact, GCWS often reaches a reasonable
accuracy with merely (less than) one epoch of the training process. This
property is much desired because many applications, such as advertisement
click-through rate (CTR) prediction models, or data streams (i.e., data seen
only once), often train just one epoch. Another beneficial side effect is that
the computations of the first layer of the neural networks become additions
instead of multiplications because the input data become binary (and highly
sparse).
Empirical comparisons with (normalized) random Fourier features (NRFF) are
provided. We also propose to reduce the model size of GCWSNet by count-sketch
and develop the theory for analyzing the impact of using count-sketch on the
accuracy of GCWS. Our analysis shows that an "8-bit" strategy should work
well in that we can always apply an 8-bit count-sketch hashing on the output of
GCWS hashing without hurting the accuracy much. There are many other ways to
take advantage of GCWS when training deep neural networks. For example, one can
apply GCWS on the outputs of the last layer to boost the accuracy of trained
deep neural networks.
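The hashing idea can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes the pGMM kernel is the min/max ratio of the powered, sign-split data, and it uses the standard consistent weighted sampling recipe (Gamma(2,1) and uniform random variables) applied to p times the log of the data, which is where the numerical stability comes from. The function names and the parameters `k` and `seed` are hypothetical.

```python
import numpy as np

def gmm_transform(x):
    """Split a signed vector of length d into a nonnegative vector of length 2d."""
    x = np.asarray(x, dtype=float)
    out = np.zeros(2 * x.size)
    out[0::2] = np.where(x > 0, x, 0.0)    # positive parts
    out[1::2] = np.where(x < 0, -x, 0.0)   # magnitudes of negative parts
    return out

def pgmm_kernel(x, y, p=1.0):
    """Reference value: sum of min over sum of max of the powered entries."""
    u, v = gmm_transform(x) ** p, gmm_transform(y) ** p
    denom = np.maximum(u, v).sum()
    return np.minimum(u, v).sum() / denom if denom > 0 else 0.0

def gcws_hash(x, p=1.0, k=64, seed=0):
    """Return k hash pairs (i*, t*). Assumes x has at least one nonzero entry."""
    u = gmm_transform(x)
    rng = np.random.default_rng(seed)       # same seed => same randomness for every vector
    r = rng.gamma(2.0, 1.0, size=(k, u.size))
    c = rng.gamma(2.0, 1.0, size=(k, u.size))
    beta = rng.uniform(0.0, 1.0, size=(k, u.size))
    nz = np.flatnonzero(u)
    log_u = p * np.log(u[nz])               # power applied in log space, never as u**p
    t = np.floor(log_u / r[:, nz] + beta[:, nz])
    log_a = np.log(c[:, nz]) - r[:, nz] * (t + 1.0 - beta[:, nz])
    best = log_a.argmin(axis=1)
    return nz[best], t[np.arange(k), best].astype(int)

x, y = [1.0, -2.0, 3.0], [2.0, -1.0, 3.0]
ix, tx = gcws_hash(x, k=4000, seed=1)
iy, ty = gcws_hash(y, k=4000, seed=1)
collision_rate = np.mean((ix == iy) & (tx == ty))
print(pgmm_kernel(x, y), collision_rate)
```

The fraction of matching hash pairs estimates the kernel value, so one-hot encoding the pairs yields the sparse binary input described in the abstract.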
【140】 Well-Conditioned Linear Minimum Mean Square Error Estimation
Authors: Edwin K. P. Chong
Abstract: Computing linear minimum mean square error (LMMSE) filters is often ill
conditioned, suggesting that unconstrained minimization of the mean square
error is an inadequate principle for filter design. To address this, we first
develop a unifying framework for studying constrained LMMSE estimation
problems. Using this framework, we expose an important structural property of
all constrained LMMSE filters and show that they all involve an inherent
preconditioning step. This parameterizes all such filters only by their
preconditioners. Moreover, each filter is invariant to invertible linear
transformations of its preconditioner. We then clarify that merely constraining
the rank of the filters, leading to the well-known low-rank Wiener filter, does
not suitably address the problem of ill conditioning. Instead, we use a
constraint that explicitly requires solutions to be well conditioned in a
certain specific sense. We introduce two well-conditioned estimators and
evaluate their mean-squared-error performance. We show these two estimators
converge to the standard LMMSE filter as their truncated-power ratio converges
to zero, but more slowly than the low-rank Wiener filter in terms of scaling
law. This exposes the price for being well conditioned. We also show
quantitative results with historical VIX data to illustrate the performance of
our two well-conditioned estimators.
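The ill-conditioning that motivates this work can be seen in a small sketch of the standard, unconstrained LMMSE filter W = C_xy C_yy^{-1}. The toy model below (a unit-variance scalar observed twice with independent noise of variance `eps`) and the function names are assumptions made for illustration; the paper's two well-conditioned estimators are not reproduced here.

```python
import numpy as np

def lmmse_filter(c_xy, c_yy):
    """Solve W C_yy = C_xy for W without forming the inverse explicitly."""
    return np.linalg.solve(c_yy.T, c_xy.T).T

def toy_covariances(eps):
    """x has unit variance; y_i = x + n_i with independent noise of variance eps."""
    c_yy = np.array([[1.0 + eps, 1.0],
                     [1.0, 1.0 + eps]])
    c_xy = np.array([[1.0, 1.0]])
    return c_xy, c_yy

eps = 1e-6
c_xy, c_yy = toy_covariances(eps)
w = lmmse_filter(c_xy, c_yy)
cond = np.linalg.cond(c_yy)   # equals (2 + eps) / eps for this model
print(w, cond)
```

As `eps` shrinks, the filter weights stay near 1/2 while the condition number of C_yy grows like 2/eps, so small errors in the estimated covariance can be amplified strongly.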
【141】 PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences
Authors: Sarwan Ali, Babatunde Bello, Prakash Chourasia, Ria Thazhe Punathil, Yijing Zhou, Murray Patterson
Abstract: The origin of the COVID-19 pandemic is still unknown and remains an
important open question. There
are speculations that bats are a possible origin. Likewise, there are many
closely related (corona-) viruses, such as SARS, which was found to be
transmitted through civets. The study of the different hosts which can be
potential carriers and transmitters of deadly viruses to humans is crucial to
understanding, mitigating and preventing current and future pandemics. In
coronaviruses, the surface (S) protein, or spike protein, is an important part
of determining host specificity since it is the point of contact between the
virus and the host cell membrane. In this paper, we classify the hosts of over
five thousand coronaviruses from their spike protein sequences, segregating
them into clusters of distinct hosts among avians, bats, camels, swine, humans
and weasels, to name a few. We propose a feature embedding based on the
well-known position-weight matrix (PWM), which we call PWM2Vec, and use it to
generate feature vectors from the spike protein sequences of these
coronaviruses. While our embedding is inspired by the success of PWMs in
biological applications such as determining protein function, or identifying
transcription factor binding sites, we are the first (to the best of our
knowledge) to use PWMs in the context of host classification from viral
sequences to generate a fixed-length feature vector representation. The results
on real-world data show that, using PWM2Vec, we perform comparably to the
baseline models. We also measure the importance
of different amino acids using information gain to show the amino acids which
are important for predicting the host of a given coronavirus.
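The abstract does not spell out how the embedding is built, so the following is only a generic position-weight-matrix sketch under stated assumptions: k-mers are taken with a sliding window, counts get a pseudocount of 1, the background is uniform over the 20 standard amino acids, and each k-mer is scored by summing its log-odds entries. All names and the example sequence fragment are hypothetical.

```python
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"             # 20 standard amino acids (assumed alphabet)
IDX = {a: i for i, a in enumerate(AMINO)}

def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def position_weight_matrix(kmer_list, k, pseudocount=1.0):
    """Log2-odds of per-position residue frequencies against a uniform background."""
    counts = np.full((len(AMINO), k), pseudocount)
    for km in kmer_list:
        for pos, aa in enumerate(km):
            counts[IDX[aa], pos] += 1
    probs = counts / counts.sum(axis=0, keepdims=True)
    return np.log2(probs * len(AMINO))

def pwm_scores(seq, k=3):
    """One score per k-mer; equal-length sequences give equal-length vectors."""
    kms = kmers(seq, k)
    pwm = position_weight_matrix(kms, k)
    return np.array([sum(pwm[IDX[aa], pos] for pos, aa in enumerate(km))
                     for km in kms])

vec = pwm_scores("MFVFLVLLPLVSS", k=3)
print(vec.shape)
```

Sequences containing characters outside the assumed alphabet would need an explicit handling rule before this sketch could be applied to real data.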
【142】 Surveying 5G Techno-Economic Research to Inform the Evaluation of 6G Wireless Technologies
Authors: Edward J. Oughton, William Lehr
Abstract: Techno-economic assessment is a fundamental technique engineers use for
evaluating new communications technologies. However, despite the
techno-economics of the fifth cellular generation (5G) being an active research
area, it is surprising there are few comprehensive evaluations of this growing
literature. With mobile network operators deploying 5G across their networks,
it is therefore an opportune time to appraise current accomplishments and
review the state-of-the-art. Such insight can inform the flurry of 6G research
papers currently underway and help engineers in their mission to provide
affordable high-capacity, low-latency broadband connectivity, globally. The
survey discusses emerging trends from the 5G techno-economic literature and
makes six key recommendations for the design and standardization of Next
Generation 6G wireless technologies.
【143】 A Keypoint Detection and Description Network Based on the Vessel Structure for Multi-Modal Retinal Image Registration
Authors: Aline Sindel, Bettina Hohberger, Sebastian Fassihi Dehcordi, Christian Mardin, Robert Lämmer, Andreas Maier, Vincent Christlein
Comments: 6 pages, 4 figures, 1 table, accepted to BVM 2022
Abstract: Ophthalmological imaging utilizes different imaging systems, such as color
fundus, infrared, fluorescein angiography, optical coherence tomography (OCT)
or OCT angiography. Multiple images with different modalities or acquisition
times are often analyzed for the diagnosis of retinal diseases. Automatically
aligning the vessel structures in the images by means of multi-modal
registration can support the ophthalmologists in their work. Our method uses a
convolutional neural network to extract features of the vessel structure in
multi-modal retinal images. We jointly train a keypoint detection and
description network on small patches using a classification and a cross-modal
descriptor loss function and apply the network to the full image size in the
test phase. Our method demonstrates the best registration performance on our
own dataset and on a public multi-modal dataset in comparison to competing methods.
【144】 Comprehensive RF Dataset Collection and Release: A Deep Learning-Based Device Fingerprinting Use Case
Authors: Abdurrahman Elmaghbub, Bechir Hamdaoui
Comments: This paper has been presented at the IEEE GLOBECOM Workshop 2021
Abstract: Deep learning-based RF fingerprinting has recently been recognized as a
potential solution for enabling newly emerging wireless network applications,
such as spectrum access policy enforcement, automated network device
authentication, and unauthorized network access monitoring and control. Real,
comprehensive RF datasets are now needed more than ever to enable the study,
assessment, and validation of newly developed RF fingerprinting approaches. In
this paper, we present and release a large-scale RF fingerprinting dataset,
collected from 25 different LoRa-enabled IoT transmitting devices using USRP
B210 receivers. Our dataset consists of a large number of SigMF-compliant
binary files representing the I/Q time-domain samples and their corresponding
FFT-based files of LoRa transmissions. This dataset provides a comprehensive
set of essential experimental scenarios, considering both indoor and outdoor
environments and various network deployments and configurations, such as the
distance between the transmitters and the receiver, the configuration of the
considered LoRa modulation, the physical location of the conducted experiment,
and the receiver hardware used for training and testing the neural network
models.
【145】 3D Intracranial Aneurysm Classification and Segmentation via Unsupervised Dual-branch Learning
Authors: Di Shao, Xuequan Lu, Xiao Liu
Comments: submitted for review (contact: xuequan.lu@deakin.edu.au)
Abstract: Intracranial aneurysms are common nowadays and how to detect them
intelligently is of great significance in digital health. While most existing
deep learning research has focused on medical images in a supervised way, we
introduce an unsupervised method for the detection of intracranial aneurysms
based on 3D point cloud data. In particular, our method consists of two stages:
unsupervised pre-training and downstream tasks. As for the former, the main
idea is to pair each point cloud with its jittered counterpart and maximise
their correspondence. Then we design a dual-branch contrastive network with an
encoder for each branch and a subsequent common projection head. As for the
latter, we design simple networks for supervised classification and
segmentation training. Experiments on the public dataset (IntrA) show that our
unsupervised method achieves comparable or even better performance than some
state-of-the-art supervised techniques, and it is most prominent in the
detection of aneurysmal vessels. Experiments on ModelNet40 also show that
our method achieves an accuracy of 90.79%, which outperforms existing
state-of-the-art unsupervised models.
【146】 Inferring Turbulent Parameters via Machine Learning
Authors: Michele Buzzicotti, Fabio Bonaccorso, Luca Biferale
Abstract: We design a machine learning technique to solve the general problem of
inferring physical parameters from the observation of turbulent flows, a
relevant exercise in many theoretical and applied fields, from engineering to
earth observation and astrophysics. Our approach is to train the machine
learning system to regress the rotation frequency of the flow's reference
frame, from the observation of the flow's velocity amplitude on a 2d plane
extracted from the 3d domain. The machine learning approach consists of a Deep
Convolutional Neural Network (DCNN) of the same kind developed in computer
vision. The training and validation datasets are produced by means of fully
resolved direct numerical simulations. This study shows interesting results
from two different points of view. First, from the machine learning point of
view, it shows the potential of DCNN, reaching good results on such a particularly
complex problem that goes well outside the limits of human vision. Second, from
the physics point of view, it provides an example of how machine learning can
be exploited in data analysis to infer information that would be inaccessible
otherwise. Indeed, by comparing DCNN with other possible Bayesian
approaches, we find that DCNN yields a much higher inference accuracy in all
the examined cases.
Machine translation, for reference only