Daily Papers

by AK and the research community

Jan 8

IXPE Observation of the Low-Synchrotron Peaked Blazar S4 0954+65 During An Optical-X-ray Flare

The X-ray polarization observations made possible with the Imaging X-ray Polarimetry Explorer (IXPE) offer new ways of probing high-energy emission processes in astrophysical jets from blazars. Here we report on the first X-ray polarization observation of the blazar S4 0954+65 in a high optical and X-ray state. During our multi-wavelength campaign on the source, we detected an optical flare whose peak coincided with the peak of an X-ray flare. This optical-X-ray flare most likely took place in a feature moving along the parsec-scale jet, imaged at 43 GHz by the Very Long Baseline Array. The 43 GHz polarization angle of the moving component underwent a rotation near the time of the flare. In the optical band, prior to the IXPE observation, we measured the polarization angle to be aligned with the jet axis. In contrast, during the optical flare the optical polarization angle was perpendicular to the jet axis; after the flare, it reverted to being parallel to the jet axis. Due to the smooth behavior of the optical polarization angle during the flare, we favor shocks as the main acceleration mechanism. We also infer that the ambient magnetic field lines in the jet were parallel to the jet position angle. The average degree of optical polarization during the IXPE observation was $(14.3 \pm 4.1)\%$. Despite the flare, we obtained only an upper limit of 14% (at the $3\sigma$ level) on the X-ray polarization degree, although a reasonable assumption on the X-ray polarization angle results in an upper limit of 8.8% ($3\sigma$). We model the spectral energy distribution (SED) and spectral polarization distribution (SPD) of S4 0954+65 with leptonic (synchrotron self-Compton) and hadronic (proton and pair synchrotron) models. The constraints we obtain with our combined multi-wavelength polarization observations and SED modeling tentatively disfavor hadronic models for the X-ray emission in S4 0954+65.

  • 137 authors
·
Nov 25, 2024
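The polarization degree and angle quoted in the abstract above are conventionally derived from the Stokes parameters I, Q and U. The following is a minimal sketch of that standard conversion, not the IXPE team's pipeline; the function name and the Stokes values are hypothetical and chosen only for illustration.

```python
import math

def polarization_from_stokes(i, q, u):
    """Return (degree, angle_deg) of linear polarization from Stokes I, Q, U."""
    if i <= 0:
        raise ValueError("Stokes I must be positive")
    # Polarization degree: polarized intensity divided by total intensity.
    degree = math.hypot(q, u) / i
    # Polarization angle, folded into [0, 180) degrees.
    angle_deg = math.degrees(0.5 * math.atan2(u, q)) % 180.0
    return degree, angle_deg

# Hypothetical Stokes values (not taken from the paper).
degree_a, angle_a = polarization_from_stokes(1.0, 0.10, 0.10)
# Flipping the sign of both Q and U rotates the angle by 90 degrees,
# the kind of parallel-to-perpendicular swing described for the optical flare.
degree_b, angle_b = polarization_from_stokes(1.0, -0.10, -0.10)
```

Because the angle depends on the ratio of U to Q while the degree depends on their quadrature sum, a 90-degree swing can occur with no change in polarization degree, as the two calls illustrate.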

The DESI PRObabilistic Value-Added Bright Galaxy Survey (PROVABGS) Mock Challenge

The PRObabilistic Value-Added Bright Galaxy Survey (PROVABGS) catalog will provide measurements of galaxy properties, such as stellar mass ($M_*$), star formation rate (${\rm SFR}$), stellar metallicity ($Z_{\rm MW}$), and stellar age ($t_{\rm age, MW}$), for $>10$ million galaxies of the DESI Bright Galaxy Survey. Full posterior distributions of the galaxy properties will be inferred using state-of-the-art Bayesian spectral energy distribution (SED) modeling of DESI spectroscopy and Legacy Surveys photometry. In this work, we present the SED model, Bayesian inference framework, and methodology of PROVABGS. Furthermore, we apply the PROVABGS SED modeling on realistic synthetic DESI spectra and photometry, constructed using the L-GALAXIES semi-analytic model. We compare the inferred galaxy properties to the true galaxy properties of the simulation using a hierarchical Bayesian framework to quantify accuracy and precision. Overall, we accurately infer the true $M_*$, ${\rm SFR}$, $Z_{\rm MW}$, and $t_{\rm age, MW}$ of the simulated galaxies. However, the priors on galaxy properties induced by the SED model have a significant impact on the posteriors. They impose a ${\rm SFR} > 10^{-1}\,M_\odot/{\rm yr}$ lower bound on ${\rm SFR}$, a ${\sim}0.3$ dex bias on $\log Z_{\rm MW}$ for galaxies with low spectral signal-to-noise, and a $t_{\rm age, MW} < 8\,{\rm Gyr}$ upper bound on stellar age. This work also demonstrates that a joint analysis of spectra and photometry significantly improves the constraints on galaxy properties over photometry alone and is necessary to mitigate the impact of the priors. With the methodology presented and validated in this work, PROVABGS will maximize information extracted from DESI observations and provide a probabilistic value-added galaxy catalog that will extend current galaxy studies to new regimes and unlock cutting-edge probabilistic analyses.

  • 19 authors
·
Feb 3, 2022

An X-ray Significantly Variable, Luminous, Type 2 Quasar at z = 2.99 with a Massive Host Galaxy

We present a comprehensive X-ray analysis and spectral energy distribution (SED) fitting of WISEA J171419.96+602724.6, an extremely luminous type 2 quasar at z = 2.99. The source was suggested as a candidate Compton-thick (column density $N_{\rm H} > 1.5 \times 10^{24}$ cm$^{-2}$) quasar by a short XMM-Newton observation in 2011. We recently observed the source with deep NuSTAR and XMM-Newton exposures in 2021 and found that the source has a lower obscuration of $N_{\rm H} \sim 5 \times 10^{22}$ cm$^{-2}$ and a flux about four times lower. The two epochs of observations suggested that the source was significantly variable in X-ray obscuration, flux, and intrinsic luminosity at $2$-$3\sigma$ in less than 2.5 years (in the source rest frame). We performed SED fitting of this source using CIGALE, thanks to the wide availability of multiwavelength data for it (from hard X-rays to radio). The source is very luminous with a bolometric luminosity of $L_{\rm BOL} \sim 2.5 \times 10^{47}$ erg s$^{-1}$. Its host galaxy has a huge star formation rate (SFR) of $\sim 1280\,M_\odot$ yr$^{-1}$ and a huge stellar mass of $\sim 1.1 \times 10^{12}\,M_\odot$. The correlation between the SFR and stellar mass of this source is consistent with what was measured in the high-z quasars. It is also consistent with what was measured in the main-sequence star-forming galaxies, suggesting that the presence of the active nucleus in our target does not enhance or suppress the SFR of its host galaxy. The source is an infrared hyper-luminous, obscured galaxy with a significant amount of hot dust in its torus and shares many properties with hot, dust-obscured galaxies.

  • 11 authors
·
Sep 3, 2024

Unlocking the radio-gamma spectrum of the pulsar wind nebula around PSR J1124-5916 in SNR G292.0+1.8

We present the first detection of GeV gamma-ray emission potentially associated with the pulsar wind nebula (PWN) hosted by the young core-collapse supernova remnant G292.0+1.8, based on a detailed time-resolved analysis of Fermi-LAT data. By isolating the unpulsed component from the dominant magnetospheric radiation of PSR J1124-5916, we successfully disentangle a candidate nebular emission in the GeV range, characterise its morphology and extract its spectrum. This identification places G292.0+1.8 among the few systems in which the pulsar and PWN contributions have been spectrally resolved at high energies, offering new insight into their respective emission mechanisms. We characterise the gamma-ray spectrum of the pulsar and model the broadband spectral energy distribution (SED) of the PWN using radio, X-ray, and GeV data. The emission is well described by a single electron population with two spectral breaks: one intrinsic to the injection spectrum and another produced by synchrotron cooling in a magnetic field of $\sim 15\,\mu$G. Notably, the inferred magnetic field and the low TeV flux of the nebula resemble those of 3C 58, suggesting that similar low-field environments can arise in young PWNe. The high-energy portion of the SED is now tightly constrained by our GeV detection and existing TeV upper limits. Compared to our model, earlier predictions tend to underpredict the gamma-ray flux, while others that succeed in reproducing the GeV component often overpredict the TeV emission. This mismatch underscores the challenges in modelling particle acceleration and radiation processes in young PWNe and establishes G292.0+1.8 as a valuable benchmark for testing and refining such models.

  • 3 authors
·
Oct 27, 2025

Cosmic Evolution Early Release Science (CEERS) survey: The colour evolution of galaxies in the distant Universe

The wavelength coverage and sensitivity of JWST now enable us to probe the rest-frame UV - optical spectral energy distributions (SEDs) of galaxies at high redshift (z>4). From these SEDs it is, in principle, possible through SED fitting to infer key physical properties, including stellar masses, star formation rates, and dust attenuation. These in turn can be compared with the predictions of galaxy formation simulations, allowing us to validate and refine the incorporated physics. However, the inference of physical properties, particularly from photometry alone, can lead to large uncertainties and potential biases. Instead, it is now possible, and common, for simulations to be forward-modelled to yield synthetic observations that can be compared directly to real observations. In this work, we measure the JWST broadband fluxes and colours of a robust sample of 5<z<10 galaxies using the Cosmic Evolution Early Release Science (CEERS) Survey. We then analyse predictions from a variety of models using the same methodology and compare the NIRCam/F277W magnitude distribution and NIRCam colours with observations. We find that the predicted and observed magnitude distributions are similar, at least at 5<z<8. At z>8 the distributions differ somewhat, though our observed sample size is small and thus susceptible to statistical fluctuations. Likewise, the predicted and observed colour evolution show broad agreement, at least at 5<z<8. There is, however, some disagreement between the observed and modelled strength of the strong line contribution. In particular, all the models fail to reproduce the F410M-F444W colour at z>8, though, again, the sample size is small here.

  • 23 authors
·
Nov 14, 2023

First Light And Reionisation Epoch Simulations (FLARES) XVI: Size Evolution of Massive Dusty Galaxies at Cosmic Dawn from UV to IR

We use the First Light And Reionisation Epoch Simulations (FLARES) to study the evolution of the rest-frame ultraviolet (UV) and far-infrared (FIR) sizes for a statistical sample of massive ($\gtrsim 10^{9}\,M_\odot$) high redshift galaxies ($z \in [5,10]$). Galaxies are post-processed using the SKIRT radiative transfer code, to self-consistently obtain the full spectral energy distribution and surface brightness distribution. We create mock observations of the galaxies for the Near Infrared Camera (NIRCam) to study the rest-frame UV 1500 Å morphology. We also generate mock rest-frame FIR (50 $\mu$m) photometry and mock ALMA (158 $\mu$m) (0.01"-0.03" and $\approx$0.3" angular resolution) observations to study the dust-continuum. We find the effect of dust on observed sizes reduces with increasing wavelength from the UV to optical ($\sim 0.6$ times the UV at 0.4 $\mu$m), with no evolution in FIR sizes. Observed sizes vary within 0.4-1.2 times the intrinsic sizes at different signal to noise ratios (SNR = 5-20) across redshifts. The effect of PSF and noise makes bright structures prominent, whereas fainter regions blend with noise, leading to an underestimation (factor of 0.4-0.8) of sizes at SNR=5. At SNR=15-20, the underestimation reduces (factor of 0.6-0.9) at z=5-8 but due to PSF, at z=9-10, bright cores are dominant, resulting in an overestimation (factor of 1.0-1.2). For ALMA, low resolution sizes are affected by noise which acts as extended emission. The size evolution in UV broadly agrees with current observational samples and other simulations. This work is one of the first to analyse the panchromatic sizes of a statistically significant sample of simulated high-redshift galaxies, complementing a growing body of research highlighting the importance of conducting an equivalent comparison between observed galaxies and their simulated counterparts in the early Universe.

  • 12 authors
·
Aug 20, 2024

1FLAT: a Firmamento-based catalog of AGN in Fermi-LAT high Galactic latitude γ-ray sources

We present a systematic reassessment of 5,062 high-Galactic latitude gamma-ray sources from the Fermi-LAT 4FGL-DR4 catalog using Firmamento, a web-based platform for multi-frequency source discovery and analysis. Our goal is to provide an independent evaluation of LAT gamma-ray source associations through alternative spectral and spatial methods that combine recent and legacy survey data, supplemented by human supervision of spectral energy distributions (SEDs), source morphology, flux variability, and template-based comparisons. Firmamento confirms the 4FGL-DR4 and 4LAC-DR3 counterparts or unassociated sources in 4,493 cases (88.8%), demonstrating the robustness of both approaches. Beyond this general agreement, we identify 421 new blazar counterparts among previously unassociated sources, thereby reducing the fraction of unidentified extragalactic Fermi-LAT sources from 25% to 17%. In addition, in 64 cases we find alternative blazar associations, while in 49 instances we do not confirm the 4FGL-DR4 association. For all confirmed blazar counterparts we provide homogeneous estimates of synchrotron peak frequency and peak flux using machine-learning and template-based methods; these agree with 4LAC-DR3 values in most cases, though significant discrepancies appear for a few dozen sources, often due to improved X-ray coverage. The primary outcome of this work is the 1st Firmamento LAT AGN table (1FLAT), made publicly available through the Firmamento platform (https://firmamento.nyuad.nyu.edu), where all related multi-wavelength data and images are available. The project involved extensive manual validation and benefited from the active participation of graduate and undergraduate students, highlighting the platform's value for both research and education.

  • 18 authors
·
Oct 8, 2025

The Binary Fraction of Red Supergiants in the Magellanic Clouds

Red supergiants (RSGs), as the descendants of OB-type stars and the progenitors of supernovae, provide crucial insights into the evolution of massive stars, particularly in binary systems. Previous studies show that the binary fraction of RSGs ($\approx 15\% - 40\%$) is significantly lower than that of their predecessors ($\approx 50\% - 70\%$). In this work, we investigate the binary fraction of RSGs with the recently selected largest samples of 4695 and 2097 RSGs in the Large Magellanic Cloud (LMC) and Small Magellanic Cloud (SMC), respectively. The binary system with a hot companion (O-, B- and A-type star) is identified by detecting the ultraviolet (UV) excess in the observed spectral energy distribution (SED) ranging from ultraviolet to mid-infrared after subtracting the model SED of RSG since RSGs are very weak in the UV band. It is found that the lower limit of binarity is $30.2\% \pm 0.7\%$ and $32.2\% \pm 1\%$ in the LMC and SMC, respectively. If the sample is limited to luminous RSGs with $\log L/L_\odot > 4.0$, the binary fraction becomes $26.6\% \pm 1.1\%$ and $26.4\% \pm 1.7\%$ in the LMC and SMC, respectively. The derived binary fraction is valid in the range of $\sim 2.3 < \log P / [{\rm d}] < \sim 8$. Our study suggests that roughly one-third of massive stars host a third companion within $\sim 30{,}000$ AU. In addition, 15 RSGs are also identified as binary via HST/STIS spectra, and a handful of the binaries identified by the SED fitting are confirmed by their light curve and radial velocity dispersion. The stellar parameters of the companions, i.e. $T_{\rm eff}$, $R$, $L$ and $\log g$, are calculated by model fitting.

  • 3 authors
·
Apr 4, 2025

Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model

Modern astronomical surveys, such as the Euclid mission, produce high-dimensional, multi-modal data sets that include imaging and spectroscopic information for millions of galaxies. These data serve as an ideal benchmark for large, pre-trained multi-modal models, which can leverage vast amounts of unlabelled data. In this work, we present the first exploration of Euclid data with AstroPT, an autoregressive multi-modal foundation model trained on approximately 300 000 optical and infrared Euclid images and spectral energy distributions (SEDs) from the first Euclid Quick Data Release. We compare self-supervised pre-training with baseline fully supervised training across several tasks: galaxy morphology classification; redshift estimation; similarity searches; and outlier detection. Our results show that: (a) AstroPT embeddings are highly informative, correlating with morphology and effectively isolating outliers; (b) including infrared data helps to isolate stars, but degrades the identification of edge-on galaxies, which are better captured by optical images; (c) simple fine-tuning of these embeddings for photometric redshift and stellar mass estimation outperforms a fully supervised approach, even when using only 1% of the training labels; and (d) incorporating SED data into AstroPT via a straightforward multi-modal token-chaining method improves photo-z predictions, and allows us to identify potentially more interesting anomalies (such as ringed or interacting galaxies) compared to a model pre-trained solely on imaging data.

  • 324 authors
·
Mar 19, 2025

EPOCHS Paper V. The dependence of galaxy formation on galaxy structure at z < 7 from JWST observations

We measure the broad impact of galaxy structure on galaxy formation by examining the ongoing star formation and integrated star formation history as revealed through the stellar masses of galaxies at z < 7 based on JWST CEERS data from the Extended Groth Strip (EGS). Using the morphological catalog of 3965 visually classified JWST galaxies from Ferreira et al. (2023), we investigate the evolution of stars, and when they form, as a function of morphological type as well as galaxies classified as passive and starburst through spectral energy distributions. Although disk galaxies dominate the structures of galaxies at z < 7, we find that these disks are in general either 'passive', or on the main-sequence of star formation, and do not contain a large population of starburst galaxies. We also find no significant correlation between morphological type and the star formation rate or colours of galaxies at z < 7. In fact, we find that the morphologically classified 'spheroids' tend to be blue and are not found to be predominately passive systems at z > 1.5. We also find that the stellar mass function for disk galaxies does not evolve significantly during this time, whereas other galaxy types, such as the peculiar population, evolve dramatically, declining at lower redshifts. This indicates that massive peculiars are more common at higher redshifts. We further find that up to $z \sim 7$, the specific star formation rate (sSFR) does not vary with visual morphology, but strongly depends on stellar mass and internal galaxy mass density. This demonstrates that at early epochs galaxy assembly is a mass-driven, rather than a morphologically-driven, process. Quenching of star formation is therefore a mass-dominated process throughout the universe's history, likely due to the presence of supermassive black holes.

  • 14 authors
·
May 1, 2024

Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Machine Learning Force Fields (MLFFs) are a promising alternative to expensive ab initio quantum mechanical molecular simulations. Given the diversity of chemical spaces that are of interest and the cost of generating new data, it is important to understand how MLFFs generalize beyond their training distributions. In order to characterize and better understand distribution shifts in MLFFs, we conduct diagnostic experiments on chemical datasets, revealing common shifts that pose significant challenges, even for large foundation models trained on extensive data. Based on these observations, we hypothesize that current supervised training methods inadequately regularize MLFFs, resulting in overfitting and learning poor representations of out-of-distribution systems. We then propose two new methods as initial steps for mitigating distribution shifts for MLFFs. Our methods focus on test-time refinement strategies that incur minimal computational cost and do not use expensive ab initio reference labels. The first strategy, based on spectral graph theory, modifies the edges of test graphs to align with graph structures seen during training. Our second strategy improves representations for out-of-distribution systems at test-time by taking gradient steps using an auxiliary objective, such as a cheap physical prior. Our test-time refinement strategies significantly reduce errors on out-of-distribution systems, suggesting that MLFFs are capable of and can move towards modeling diverse chemical spaces, but are not being effectively trained to do so. Our experiments establish clear benchmarks for evaluating the generalization capabilities of the next generation of MLFFs. Our code is available at https://tkreiman.github.io/projects/mlff_distribution_shifts/.

  • 2 authors
·
Mar 11, 2025

Probing X-ray Timing and Spectral Variability in the Blazar PKS 2155-304 Over a Decade of XMM-Newton Observations

Blazars, a class of active galactic nuclei (AGN) powered by supermassive black holes, are known for their remarkable variability across multiple timescales and wavelengths. With advancements in both ground- and space-based telescopes, our understanding of AGN central engines has significantly improved. However, the mechanisms driving this variability remain elusive, and continue to fascinate both theorists and observers alike. The primary objective of this study is to constrain the X-ray variability properties of the TeV blazar PKS 2155-304. We conduct a comprehensive X-ray spectral and timing analysis, focusing on both long-term and intra-day variability. This analysis uses data from 22 epochs of XMM-Newton EPIC-pn observations, collected over 15 years (2000-2014). To investigate the variability of the source, we applied both timing and spectral analyses. For the timing analysis, we estimated fractional variability, variability amplitude, minimum variability timescales, flux distribution, and power spectral density (PSD). In the spectral analysis, we fitted the X-ray spectra using power-law, log-parabola, and broken power-law (BPL) models to determine the best-fitting parameters. Additionally, we studied the hardness ratio (HR). We observed moderate intra-day variability in most of the light curves. Seven out of the twenty-two observations showed a clear bimodal flux distribution, indicating the presence of two distinct flux states. Our analysis revealed a variable power-law PSD slope. Most HR plots did not show significant variation with flux, except for one observation (OBSID 0124930501), where HR increased with flux (Count/s). The fitted X-ray spectra favored the BPL model for the majority of observations. The findings of this work shed light on the intraday variability of blazars, providing insights into the non-thermal jet processes that drive the observed flux variations.

  • 8 authors
·
Oct 2, 2024
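The fractional variability named in the abstract above is commonly computed as the square root of the excess variance (sample variance minus mean squared measurement error) divided by the mean flux. The sketch below shows that widely used definition under stated assumptions: the abstract does not specify the exact estimator the authors used, the light-curve values are made up, and returning 0.0 when the excess variance is not positive is a convention chosen here for illustration.

```python
import math
import statistics

def fractional_variability(flux, flux_err):
    """Fractional variability: sqrt(S^2 - <err^2>) / <flux>."""
    if len(flux) != len(flux_err) or len(flux) < 2:
        raise ValueError("need at least two flux points with matching errors")
    mean_flux = statistics.fmean(flux)
    sample_var = statistics.variance(flux)  # unbiased estimator (divides by N - 1)
    mean_sq_err = statistics.fmean([e * e for e in flux_err])
    excess_var = sample_var - mean_sq_err
    if excess_var <= 0:
        # Measurement noise accounts for all observed scatter (assumed convention).
        return 0.0
    return math.sqrt(excess_var) / mean_flux

# Hypothetical light curve in counts/s with uniform errors.
flux = [10.0, 12.0, 8.0, 11.0, 9.0]
flux_err = [0.5] * 5
f_var = fractional_variability(flux, flux_err)
```

Subtracting the mean squared error before taking the square root keeps pure measurement noise from being counted as intrinsic source variability.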

Modeling Eye Gaze Velocity Trajectories using GANs with Spectral Loss for Enhanced Fidelity

Accurate modeling of eye gaze dynamics is essential for advancement in human-computer interaction, neurological diagnostics, and cognitive research. Traditional generative models like Markov models often fail to capture the complex temporal dependencies and distributional nuance inherent in eye gaze trajectory data. This study introduces a GAN framework employing LSTM and CNN generators and discriminators to generate high-fidelity synthetic eye gaze velocity trajectories. We conducted a comprehensive evaluation of four GAN architectures (CNN-CNN, LSTM-CNN, CNN-LSTM, and LSTM-LSTM) trained under two conditions: using only adversarial loss and using a weighted combination of adversarial and spectral losses. Our findings reveal that the LSTM-CNN architecture trained with this new loss function exhibits the closest alignment to the real data distribution, effectively capturing both the distribution tails and the intricate temporal dependencies. The inclusion of spectral regularization significantly enhances the GAN's ability to replicate the spectral characteristics of eye gaze movements, leading to a more stable learning process and improved data fidelity. Comparative analysis with an HMM optimized to four hidden states further highlights the advantages of the LSTM-CNN GAN. Statistical metrics show that the HMM-generated data significantly diverges from the real data in terms of mean, standard deviation, skewness, and kurtosis. In contrast, the LSTM-CNN model closely matches the real data across these statistics, affirming its capacity to model the complexity of eye gaze dynamics effectively. These results position the spectrally regularized LSTM-CNN GAN as a robust tool for generating synthetic eye gaze velocity data with high fidelity.

  • 6 authors
·
Dec 5, 2024
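The abstract above describes training with a weighted combination of adversarial and spectral losses but does not give the form of the spectral term. One plausible form, assumed here purely for illustration, is the mean squared difference between the log power spectra of a real and a generated sequence. The function names, the `weight` value and the test signals below are all hypothetical.

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided power spectrum via a direct discrete Fourier transform."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def spectral_loss(real, generated, eps=1e-8):
    """Mean squared difference of log power spectra (assumed form)."""
    if len(real) != len(generated):
        raise ValueError("sequences must have equal length")
    p_real = power_spectrum(real)
    p_gen = power_spectrum(generated)
    return sum((math.log(a + eps) - math.log(b + eps)) ** 2
               for a, b in zip(p_real, p_gen)) / len(p_real)

def total_loss(adversarial_loss, real, generated, weight=0.1):
    """Weighted combination of an adversarial loss value and the spectral term."""
    return adversarial_loss + weight * spectral_loss(real, generated)

# Hypothetical sequences: same amplitude, different dominant frequency.
real = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
generated = [math.sin(2 * math.pi * 5 * t / 16) for t in range(16)]
mismatch = spectral_loss(real, generated)
```

The two test sequences have the same variance, so a loss based only on pointwise amplitude statistics would not separate them, whereas the spectral term does. In an actual training loop this term would be computed with a differentiable FFT from the chosen deep learning framework rather than the pure-Python transform used here.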

DDS2M: Self-Supervised Denoising Diffusion Spatio-Spectral Model for Hyperspectral Image Restoration

Diffusion models have recently received a surge of interest due to their impressive performance for image restoration, especially in terms of noise robustness. However, existing diffusion-based methods are trained on a large amount of training data and perform very well in-distribution, but can be quite susceptible to distribution shift. This is especially inappropriate for data-starved hyperspectral image (HSI) restoration. To tackle this problem, this work puts forth a self-supervised diffusion model for HSI restoration, namely Denoising Diffusion Spatio-Spectral Model (DDS2M), which works by inferring the parameters of the proposed Variational Spatio-Spectral Module (VS2M) during the reverse diffusion process, solely using the degraded HSI without any extra training data. In VS2M, a variational inference-based loss function is customized to enable the untrained spatial and spectral networks to learn the posterior distribution, which serves as the transitions of the sampling chain to help reverse the diffusion process. Benefiting from its self-supervised nature and the diffusion process, DDS2M enjoys stronger generalization ability to various HSIs compared to existing diffusion-based methods and superior robustness to noise compared to existing HSI restoration methods. Extensive experiments on HSI denoising, noisy HSI completion and super-resolution on a variety of HSIs demonstrate DDS2M's superiority over the existing task-specific state-of-the-art methods.

  • 4 authors
·
Mar 12, 2023

Interpretable structural model error discovery from sparse assimilation increments using spectral bias-reduced neural networks: A quasi-geostrophic turbulence test case

Earth system models suffer from various structural and parametric errors in their representation of nonlinear, multi-scale processes, leading to uncertainties in their long-term projections. The effects of many of these errors (particularly those due to fast physics) can be quantified in short-term simulations, e.g., as differences between the predicted and observed states (analysis increments). With the increase in the availability of high-quality observations and simulations, learning nudging from these increments to correct model errors has become an active research area. However, most studies focus on using neural networks, which while powerful, are hard to interpret, are data-hungry, and poorly generalize out-of-distribution. Here, we show the capabilities of Model Error Discovery with Interpretability and Data Assimilation (MEDIDA), a general, data-efficient framework that uses sparsity-promoting equation-discovery techniques to learn model errors from analysis increments. Using two-layer quasi-geostrophic turbulence as the test case, MEDIDA is shown to successfully discover various linear and nonlinear structural/parametric errors when full observations are available. Discovery from spatially sparse observations is found to require highly accurate interpolation schemes. While NNs have shown success as interpolators in recent studies, here, they are found inadequate due to their inability to accurately represent small scales, a phenomenon known as spectral bias. We show that a general remedy, adding a random Fourier feature layer to the NN, resolves this issue enabling MEDIDA to successfully discover model errors from sparse observations. These promising results suggest that with further development, MEDIDA could be scaled up to models of the Earth system and real observations.

  • 3 authors
·
Sep 22, 2023
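The remedy mentioned in the abstract above, a random Fourier feature layer placed in front of a neural network, is commonly written as a fixed random projection followed by cosine and sine. The sketch below shows that common form only and is not the MEDIDA code; the helper name, feature count, `scale` and seed are hypothetical choices for illustration.

```python
import math
import random

def make_fourier_features(in_dim, num_features, scale, seed=0):
    """Build encode(x) = [cos(2*pi*B x), sin(2*pi*B x)] with fixed Gaussian B."""
    rng = random.Random(seed)
    # B is sampled once and then kept fixed (it is not trained).
    b = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
         for _ in range(num_features)]

    def encode(x):
        if len(x) != in_dim:
            raise ValueError("input has wrong dimension")
        proj = [2.0 * math.pi * sum(w * xi for w, xi in zip(row, x))
                for row in b]
        return [math.cos(p) for p in proj] + [math.sin(p) for p in proj]

    return encode

# Hypothetical settings: 2-D coordinates mapped to 16 features.
encode = make_fourier_features(in_dim=2, num_features=8, scale=4.0)
features = encode([0.1, 0.2])
```

A larger `scale` spreads the random frequencies more widely, which lets the downstream network represent finer spatial structure, at the risk of fitting noise if it is set too high.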

Repeating fast radio bursts from synchrotron maser radiation in localized plasma blobs: Application to FRB 20121102A

The radiation physics of repeating fast radio bursts (FRBs) remains enigmatic. Motivated by the observed narrow-banded emission spectrum and ambiguous fringe pattern of the spectral peak frequency ($\nu_{\rm pk}$) distribution of some repeating FRBs, such as FRB 20121102A, we propose that the bursts from repeating FRBs arise from synchrotron maser radiation in localized blobs within weakly magnetized plasma that relativistically moves toward observers. Assuming the plasma moves toward the observers with a bulk Lorentz factor of $\Gamma = 100$ and the electron distribution in an individual blob is monoenergetic ($\gamma_{\rm e} \sim 300$), our analysis shows that bright and narrow-banded radio bursts with peak flux density $\sim 1\,{\rm Jy}$ at peak frequency ($\nu_{\rm pk}$) $\sim 3.85$ GHz can be produced by the synchrotron maser emission if the plasma blob has a magnetization factor of $\sigma \sim 10^{-5}$ and a frequency of $\nu_{\rm P} \sim 4.5$ MHz. The spectrum of bursts with lower $\nu_{\rm pk}$ tends to be narrower. Applying our model to the bursts of FRB 20121102A, the distributions of both the observed $\nu_{\rm pk}$ and isotropic energy $E_{\rm iso}$ detected by the Arecibo telescope at the L band and the Green Bank Telescope at the C band are successfully reproduced. We find that the $\nu_{\rm P}$ distribution exhibits several peaks, similar to those observed in the $\nu_{\rm pk}$ distribution of FRB 20121102A. This implies that the synchrotron maser emission in FRB 20121102A is triggered in different plasma blobs with varying $\nu_{\rm P}$, likely due to the inhomogeneity of relativistic electron number density.

  • 5 authors
·
Feb 16, 2025

Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?

Kolmogorov-Arnold networks (KANs) are a remarkable innovation consisting of learnable activation functions with the potential to capture more complex relationships from data. Although KANs are useful in finding symbolic representations and continual learning of one-dimensional functions, their effectiveness in diverse machine learning (ML) tasks, such as vision, remains questionable. Presently, KANs are deployed by replacing multilayer perceptrons (MLPs) in deep network architectures, including advanced architectures such as vision Transformers (ViTs). In this paper, we are the first to design a general learnable Kolmogorov-Arnold Attention (KArAt) for vanilla ViTs that can operate on any choice of basis. However, the computing and memory costs of training them motivated us to propose a more modular version, and we designed particular learnable attention, called Fourier-KArAt. Fourier-KArAt and its variants either outperform their ViT counterparts or show comparable performance on CIFAR-10, CIFAR-100, and ImageNet-1K datasets. We dissect these architectures' performance and generalization capacity by analyzing their loss landscapes, weight distributions, optimizer path, attention visualization, and spectral behavior, and contrast them with vanilla ViTs. The goal of this paper is not to produce parameter- and compute-efficient attention, but to encourage the community to explore KANs in conjunction with more advanced architectures that require a careful understanding of learnable activations. Our open-source code and implementation details are available on: https://subhajitmaity.me/KArAt

  • 4 authors
·
Mar 13, 2025